A new multivariate concept of quantile, based on a directional
version of Koenker and Bassett's traditional regression quantiles, is 
introduced for multivariate location and multiple-output regression 
problems. In their empirical version, those quantiles can be com- 
puted efficiently via linear programming techniques. Consistency, Ba- 
hadur representation and asymptotic normality results are estab- 
lished. Most importantly, the contours generated by those quantiles 
are shown to coincide with the classical halfspace depth contours as- 
sociated with the name of Tukey. This relation does not only allow 
for efficient depth contour computations by means of parametric lin- 
ear programming, but also for transferring from the quantile to the 
depth universe such asymptotic results as Bahadur representations. 
Finally, linear programming duality opens the way to promising de- 
velopments in depth-related multivariate rank-based inference. 

1. Introduction: Multivariate quantiles and statistical depth. In this pa- 
per, we propose a definition of multivariate quantiles/multiple-output regres- 
sion quantiles enjoying all the probabilistic and analytical properties one is 
generally expecting from a quantile, while exhibiting a very strong and fun- 
damental connection with the concept of halfspace depth. Some of the basic 
ideas of this definition were exposed in an unpublished master's thesis by
Laine [21], quoted in [16]. In this paper, we carefully revive Laine's ideas,
and systematically develop and prove the main properties of the concept he 
introduced. 

A huge literature has been devoted to the problem of extending to a 
multivariate setting the fundamental one-dimensional concept of quantile; 
see, for instance, [1, 3-7, 10, 15, 19, 34] and [37] or [33] for a recent sur- 
vey. An equally huge literature — see [9, 22, 39] and [40] for a comprehensive 
account — is dealing with the concept of (location) depth. The philosophies 
underlying those two concepts at first sight are quite different, and even, 
to some extent, opposite. While quantiles resort to analytical characterizations
through inverse distribution functions or $L_1$ optimization, depth often
derives from more geometric considerations such as halfspaces, simplices, 
ellipsoids and projections. Both carry advantages and some drawbacks. An- 
alytical definitions usually bring in efficient algorithms and tractable asymp- 
totics. The geometric ones enjoy attractive equivariance properties and in- 
tuitive contents, but their probabilistic study and asymptotics are generally 
trickier, while their implementation, as a rule, leads to heavy combinatorial 
algorithms; a highly elegant analytical approach to depth has been proposed 
in [24], but does not help much in that respect. 

Yet, beyond those sharp methodological differences, quantiles and depth 
obviously exhibit a close conceptual kinship. In the univariate case, all
definitions basically agree that the depth of a point $x \in \mathbb{R}$ with respect to
a probability distribution $P$ with strictly monotone distribution function $F$
should be $\min(F(x), 1 - F(x))$, so that the only points with depth $d$ are
$x_d := F^{-1}(d)$ and $x_{1-d} := F^{-1}(1-d)$, the quantiles of orders $d$ and $1-d$,
respectively. Starting with dimension two, no such clear and undisputable 
relation has been established so far — how could there be one, by the way, as 
long as no clear and undisputable definition of a multivariate quantile has 
been agreed upon? Bridging the gap between the two concepts thus would 
allow for transferring to the depth universe the analytical and algorithmic 
tools of the quantile approach, while sorting out the many candidates for a 
sound definition of multivariate quantiles. Establishing a relation between 
the quantile and depth philosophies in $\mathbb{R}^k$, if at all possible, therefore is
highly desirable. 
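
The univariate relation between depth and quantiles recalled above can be checked numerically. The following minimal sketch assumes that $P$ is the standard normal distribution (an illustrative choice, not one made in the paper); the helper name `depth` is hypothetical.

```python
from statistics import NormalDist

# Illustrative assumption: P is the standard normal distribution.
P = NormalDist(mu=0.0, sigma=1.0)

def depth(x):
    """Univariate depth min(F(x), 1 - F(x)) of the point x with respect to P."""
    F = P.cdf(x)
    return min(F, 1.0 - F)

d = 0.1
x_d = P.inv_cdf(d)          # quantile of order d
x_1md = P.inv_cdf(1.0 - d)  # quantile of order 1 - d

# Both quantiles have depth d (up to floating-point error).
print(depth(x_d), depth(x_1md))
```

The two points of depth $d$ form the univariate analog of a depth contour, which is the viewpoint developed for $k \geq 2$ in the sequel.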

An important step in that direction has been made very recently in a 
paper by Kong and Mizera [20]. Kong and Mizera adopt a very simple and, 
at first sight, quite natural projection-based definition of quantiles. In that 
approach, denoting by $\mathbf{u}$ a point on the unit sphere $\mathcal{S}^{k-1}$, a quantile of
order $\tau \in (0,1)$ is either a real number $q_{\mathrm{KM};\tau\mathbf{u}} \in \mathbb{R}$
($q^{(n)}_{\mathrm{KM};\tau\mathbf{u}}$ in the empirical case), the point
$\mathbf{q}_{\mathrm{KM};\tau\mathbf{u}} := q_{\mathrm{KM};\tau\mathbf{u}}\mathbf{u} \in \mathbb{R}^k$
(resp., $\mathbf{q}^{(n)}_{\mathrm{KM};\tau\mathbf{u}}$), or the hyperplane
$\pi_{\mathrm{KM};\tau\mathbf{u}}$ (resp., $\pi^{(n)}_{\mathrm{KM};\tau\mathbf{u}}$) orthogonal to $\mathbf{u}$
at $\mathbf{q}_{\mathrm{KM};\tau\mathbf{u}}$ (resp., $\mathbf{q}^{(n)}_{\mathrm{KM};\tau\mathbf{u}}$). The scalar
quantity $q_{\mathrm{KM};\tau\mathbf{u}} \in \mathbb{R}$ is defined as the quantile of order $\tau$ of the
univariate distribution obtained by projecting $P$ onto the oriented straight line with
unit vector $\mathbf{u}$, and therefore derives from purely univariate $L_1$ arguments; see
Section 4.3 for details. The resulting quantile contours (the collections, for
fixed $\tau$, of $\mathbf{q}_{\mathrm{KM};\tau\mathbf{u}}$'s) do not enjoy the properties (independence with
respect to the choice of an origin, affine-equivariance, nestedness, etc.) one is
expecting from a quantile concept. However, somewhat surprisingly, the envelopes
of these contours, namely the inner regions characterized by the (infinite)
fixed-$\tau$ collections of $\pi_{\mathrm{KM};\tau\mathbf{u}}$'s (resp., $\pi^{(n)}_{\mathrm{KM};\tau\mathbf{u}}$'s),
coincide with Tukey's halfspace depth regions, which provides a most interesting,
though somewhat indirect, conceptual bridge between the two concepts.

Our quantiles also are associated with unit vectors $\mathbf{u} \in \mathcal{S}^{k-1}$, hence also
are directional quantiles. However, instead of projecting onto the straight
line defined by $\mathbf{u}$, we stay in a $k$-dimensional setting, where $\mathbf{u}$ simply
indicates the reference "vertical" direction for a regression quantile construction
in the Koenker and Bassett [18] style. As in [18], our quantiles thus are
hyperplanes $\pi_{\tau\mathbf{u}}$ ($\pi^{(n)}_{\tau\mathbf{u}}$ in the empirical case); in contrast with
$\pi_{\mathrm{KM};\tau\mathbf{u}}$ and $\pi^{(n)}_{\mathrm{KM};\tau\mathbf{u}}$, however, the fixed-$\mathbf{u}$ collections of
$\pi_{\tau\mathbf{u}}$'s and $\pi^{(n)}_{\tau\mathbf{u}}$'s are not collections of
parallel hyperplanes all orthogonal to $\mathbf{u}$. Whereas projection quantiles only
involve univariate $L_1$ arguments, ours indeed rely on fully $k$-dimensional $L_1$
optimization. As shown in Section 4, the inner regions characterized by the
fixed-$\tau$ collections of $\pi_{\tau\mathbf{u}}$'s (resp., $\pi^{(n)}_{\tau\mathbf{u}}$'s) also coincide
with Tukey's halfspace depth regions. Contrary to Kong and Mizera's, however, the
$\pi_{\tau\mathbf{u}}$ quantile hyperplanes do enjoy all the desirable properties of a
well-behaved quantile concept. And, in the empirical case, our quantile hyperplanes
and the faces of Tukey's (polyhedral) depth contours essentially coincide, in the
sense that the latter constitute a (finite) subcollection of the finite collection
of $\pi^{(n)}_{\tau\mathbf{u}}$'s, itself a finite subcollection of the infinite collection of
$\pi^{(n)}_{\mathrm{KM};\tau\mathbf{u}}$'s.

From their $L_1$ definitions, the $\pi_{\tau\mathbf{u}}$'s and $\pi^{(n)}_{\tau\mathbf{u}}$'s both inherit a
probabilistic interpretation allowing for tractable asymptotics: consistency,
Bahadur representations and asymptotic normality. From their relation to
depth, the resulting contours acquire a series of nice geometric properties
such as convexity, nestedness and affine-equivariance; and, since empirical
Tukey depth contours fully characterize the empirical distribution (see [35]),
our quantile contours (as well as Kong and Mizera's) also do. Above all,
our quantiles receive the important benefits of linear programming algorithms,
which thereby automatically transfer to depth, hence (indirectly,
though; see [26]) also to the Kong and Mizera concept. Moreover, both
concepts readily generalize to the regression setting, yielding nested polyhedral
regions wrapping, up to the classical quantile crossings, a median
or deepest regression hypertube (see [26] for a detailed comparison of our
regression quantile hypertubes and those resulting from the Kong and Mizera
approach). This extends to the multiple-output context the celebrated
single-output Koenker and Bassett concept of regression quantiles. Conversely,
as indicated in [26], it also leads to a concept of multiple-output
regression halfspace depth; that depth concept, however, has the nature of
a point regression depth, hence is distinct from the Rousseeuw and Hubert
regression depth concept (see [29]), which is a hyperplane depth concept.
A constrained optimization form of the definition of $\pi_{\tau\mathbf{u}}$ also allows for
computing Lagrange multipliers with most interesting statistical applications.
Finally, by resorting to classical linear programming duality, a concept
of directional regression rank scores, allowing for multivariate versions
of the methods developed in [12], naturally comes into the picture.

From an applied perspective, the possibility of computing Tukey depth 
contours via parametric linear programming is not a small advantage. The 
complexity of computing the depth of a given point is $O(n^{k-1}\log n)$, with
algorithms by Rousseeuw and Ruts [28] for $k = 2$ and Rousseeuw and Struyf
[31] for general $k$. The best known algorithm for computing all depth contours
has complexity $O(n^2)$ (see [23]) in dimension $k = 2$. To the best of our
knowledge, no exact implementable algorithm is available so far for k > 2. 
Our approach allows for higher values of k, and we could easily run our 
algorithms in dimension k = 5, for a few hundred observations. 

The paper is organized as follows. Section 2 introduces the definitions 
and main notation to be used throughout. In Section 3, we study the main 
properties of the new quantiles: from their directional quantile nature, they 
inherit subgradient characterizations (Section 3.1), equivariance properties 
(Section 3.2), and quantile-like asymptotics — strong consistency, Bahadur 
representation and asymptotic normality (Section 3.3). In Section 4, we establish
the equivalence of the quantile contours thus obtained with the more
traditional halfspace (or Tukey) depth contours, as well as their relation to
the recent results by Kong and Mizera [20] and Wei [37]. Section 5 is devoted
to the computational aspects of our multivariate quantiles, and Section 6 to 
their extension to a multiple-output regression context. A brief application 
to real data is discussed in Section 7. Section 8 concludes with some per- 
spectives for future research. Proofs are collected in the Appendix. 

2. Definition and notation. Consider the $k$-variate random vector
$\mathbf{Z} := (Z_1, \ldots, Z_k)'$. The multivariate quantiles we are proposing are directional
objects, more precisely, $(k-1)$-dimensional hyperplanes indexed by vectors
$\boldsymbol{\tau}$ ranging over the open unit ball (deprived of the origin)
$\mathcal{B}^k := \{\mathbf{z} \in \mathbb{R}^k : 0 < \|\mathbf{z}\| < 1\}$ of $\mathbb{R}^k$. This directional index
$\boldsymbol{\tau}$ naturally factorizes into $\boldsymbol{\tau} =: \tau\mathbf{u}$,
where $\tau = \|\boldsymbol{\tau}\| \in (0,1)$ and
$\mathbf{u} \in \mathcal{S}^{k-1} := \{\mathbf{z} \in \mathbb{R}^k : \|\mathbf{z}\| = 1\}$. Denoting by
$\boldsymbol{\Gamma}_{\mathbf{u}}$ an arbitrary $k \times (k-1)$ matrix of unit vectors such that
$(\mathbf{u} \,\vdots\, \boldsymbol{\Gamma}_{\mathbf{u}})$ constitutes
an orthonormal basis of $\mathbb{R}^k$, we define the $\boldsymbol{\tau}$-quantile of $\mathbf{Z}$ as the
regression $\tau$-quantile hyperplane obtained (in the traditional Koenker and Bassett [18]
sense) when regressing $Z_{\mathbf{u}} := \mathbf{u}'\mathbf{Z}$ on the marginals of
$\mathbf{Z}^{\perp}_{\mathbf{u}} := \boldsymbol{\Gamma}_{\mathbf{u}}'\mathbf{Z}$ and a
constant term: the vector $\mathbf{u}$ therefore indicates the direction of the "vertical"
axis in the regression, while $\boldsymbol{\Gamma}_{\mathbf{u}}$ simply provides an orthonormal basis of the
vector space orthogonal to $\mathbf{u}$. More precisely, denoting by
$x \mapsto \rho_\tau(x) := x(\tau - I_{[x<0]})$ the usual $\tau$-quantile check function, we adopt
the following definition.

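Definition 2.1. The $\boldsymbol{\tau}$-quantile of $\mathbf{Z}$
($\boldsymbol{\tau} =: \tau\mathbf{u} \in \mathcal{B}^k$) is any element of
the collection $\Pi_{\boldsymbol{\tau}}$ of hyperplanes
$\pi_{\boldsymbol{\tau}} := \{\mathbf{z} \in \mathbb{R}^k : \mathbf{u}'\mathbf{z} = \mathbf{b}_{\boldsymbol{\tau}}'\boldsymbol{\Gamma}_{\mathbf{u}}'\mathbf{z} + a_{\boldsymbol{\tau}}\}$
such that

$$(a_{\boldsymbol{\tau}}, \mathbf{b}_{\boldsymbol{\tau}}')' \in \operatorname*{arg\,min}_{(a,\mathbf{b}')' \in \mathbb{R}^k} \Psi_{\boldsymbol{\tau}}(a,\mathbf{b}), \tag{2.1}$$

where $\Psi_{\boldsymbol{\tau}}(a,\mathbf{b}) := \mathrm{E}[\rho_\tau(Z_{\mathbf{u}} - \mathbf{b}'\mathbf{Z}^{\perp}_{\mathbf{u}} - a)]$.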

This definition tacitly requires the existence, for Z, of finite first-order 
moments: see the comment below. For the sake of notational simplicity, 
quantiles, here and in the sequel, are associated with a random vector Z, 
though they actually are attributes of Z's probability distribution P. 

Definition 2.1 clearly extends the traditional univariate one. For $k = 1$,
indeed, hyperplanes of dimension $k - 1$ are simply points, $\mathcal{B}^k$ reduces to
$(-1,0) \cup (0,1)$ and $\pi_{\boldsymbol{\tau}}$ to a "classical" quantile, of order
$1 - \|\boldsymbol{\tau}\|$ ($\boldsymbol{\tau}$ pointing to the left) or $\|\boldsymbol{\tau}\|$
($\boldsymbol{\tau}$ pointing to the right). This couple of quantiles
constitutes (for $k = 1$) a quantile contour, indicating that a sensible relation
between depth and quantiles should associate depth contours with contour-valued
rather than with point-valued quantiles.
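
The $k = 1$ case can be illustrated with a minimal sketch of the empirical analog of Definition 2.1, in which $\mathbf{u}$ reduces to $\pm 1$ and no $\boldsymbol{\Gamma}_{\mathbf{u}}$ is needed. The function names and the sample below are hypothetical, and the minimization is done by plain enumeration over the data points (a minimizer of a sum of check functions can always be found among them), not by linear programming.

```python
def check_loss(x, tau):
    """Koenker-Bassett check function rho_tau(x) = x * (tau - 1[x < 0])."""
    return x * (tau - (1.0 if x < 0 else 0.0))

def directional_quantile_1d(sample, tau, u):
    """Empirical (tau * u)-quantile for k = 1, with u equal to +1 or -1.

    Minimizes sum_i rho_tau(u * z_i - a) over a, searching among the
    projected observations u * z_i, and returns the point z with u * z = a.
    """
    projected = [u * z for z in sample]
    best_a = min(
        projected,
        key=lambda a: sum(check_loss(y - a, tau) for y in projected),
    )
    return u * best_a  # since u is +1 or -1, z = a / u = u * a

# Hypothetical sample of n = 9 observations.
data = [0.3, -1.2, 2.5, 0.9, -0.4, 1.7, -2.1, 0.1, 1.1]

right = directional_quantile_1d(data, 0.2, +1)  # order 0.2 quantile
left = directional_quantile_1d(data, 0.2, -1)   # order 0.8 quantile
print(right, left)  # prints -1.2 1.7
```

With $\tau = 0.2$, the direction $u = +1$ returns the second smallest observation and $u = -1$ the second largest one, so the pair forms the empirical quantile contour of order 0.2 in this toy sample.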

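Note that the quantile hyperplanes $\pi_{\boldsymbol{\tau}}$ and the "intercepts" $a_{\boldsymbol{\tau}}$ are well
defined in the sense that they only depend on $\boldsymbol{\tau}$, not on the coordinate
system associated with the (arbitrary) choice of $\boldsymbol{\Gamma}_{\mathbf{u}}$. However, the "slope"
coefficients $\mathbf{b}_{\boldsymbol{\tau}} = \mathbf{b}_{\boldsymbol{\tau}}(\boldsymbol{\Gamma}_{\mathbf{u}})$ do depend on
$\boldsymbol{\Gamma}_{\mathbf{u}}$, a dependence we do not stress
in the notation unless really necessary.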

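Each quantile hyperplane $\pi_{\boldsymbol{\tau}}$ [each element $(a_{\boldsymbol{\tau}}, \mathbf{b}_{\boldsymbol{\tau}}')'$ of
$\operatorname*{arg\,min}_{(a,\mathbf{b}')' \in \mathbb{R}^k} \Psi_{\boldsymbol{\tau}}(a,\mathbf{b})$] characterizes a lower (open)
quantile halfspace

$$H^-_{\boldsymbol{\tau}} = H^-_{\boldsymbol{\tau}}(a_{\boldsymbol{\tau}}, \mathbf{b}_{\boldsymbol{\tau}}) := \{\mathbf{z} \in \mathbb{R}^k : \mathbf{u}'\mathbf{z} < \mathbf{b}_{\boldsymbol{\tau}}'\boldsymbol{\Gamma}_{\mathbf{u}}'\mathbf{z} + a_{\boldsymbol{\tau}}\} \tag{2.2}$$

and an upper (closed) quantile halfspace

$$H^+_{\boldsymbol{\tau}} = H^+_{\boldsymbol{\tau}}(a_{\boldsymbol{\tau}}, \mathbf{b}_{\boldsymbol{\tau}}) := \{\mathbf{z} \in \mathbb{R}^k : \mathbf{u}'\mathbf{z} \geq \mathbf{b}_{\boldsymbol{\tau}}'\boldsymbol{\Gamma}_{\mathbf{u}}'\mathbf{z} + a_{\boldsymbol{\tau}}\}. \tag{2.3}$$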

As already mentioned, Definition 2.1 requires $\mathbf{Z}$ to have finite first-order
moments. Actually, modifying (2.1) into
$(a_{\boldsymbol{\tau}}, \mathbf{b}_{\boldsymbol{\tau}}')' \in \operatorname*{arg\,min}_{(a,\mathbf{b}')' \in \mathbb{R}^k} (\Psi_{\boldsymbol{\tau}}(a,\mathbf{b}) - \Psi_{\boldsymbol{\tau}}(0,\mathbf{0}))$
has no impact on $\pi_{\boldsymbol{\tau}}$, while allowing to relax the moment condition
on $Z_{\mathbf{u}}$; finite first-order moments, however, still are required for
$\mathbf{Z}^{\perp}_{\mathbf{u}}$. When $\mathbf{u}$
ranges over $\mathcal{S}^{k-1}$ (for instance, when defining quantile contours), we need
finite first-order moments for all $\mathbf{Z}^{\perp}_{\mathbf{u}}$'s, hence for $\mathbf{Z}$ itself. For the sake of
simplicity, we often adopt the following assumption in the sequel.

Assumption (A). The distribution of the random vector $\mathbf{Z}$ is absolutely
continuous with respect to the Lebesgue measure on $\mathbb{R}^k$, with a density ($f$,
say) that has connected support, and admits finite first-order moments.

The minimization problem (2.1) may have multiple solutions, yielding
distinct hyperplanes $\pi_{\boldsymbol{\tau}}$. This, however, does not occur under Assumption
(A), as shown in the following result, which is a particular case of Theorem
2.1 in [26].

Proposition 2.1. Let Assumption (A) hold. Then, for any $\boldsymbol{\tau} \in \mathcal{B}^k$, the
minimizer $(a_{\boldsymbol{\tau}}, \mathbf{b}_{\boldsymbol{\tau}}')'$ in (2.1), hence also the resulting quantile hyperplane
$\pi_{\boldsymbol{\tau}}$, is unique.

The family of hyperplanes $\Pi = \{\pi_{\boldsymbol{\tau}} : \boldsymbol{\tau} = \tau\mathbf{u} \in \mathcal{B}^k\}$ can be
considered from two different points of view. The directional point of view, associated with
the fixed-$\mathbf{u}$ subfamilies $\Pi_{\mathbf{u}} := \{\pi_{\boldsymbol{\tau}} : \boldsymbol{\tau} = \tau\mathbf{u}, \tau \in (0,1)\}$, is the one
emphasized so far in the definition, and provides, for each $\mathbf{u}$, the usual interpretation
of a collection of regression quantile hyperplanes. Another point of view is
associated with the fixed-$\tau$ subfamilies
$\Pi_{\tau} := \{\pi_{\boldsymbol{\tau}} : \boldsymbol{\tau} = \tau\mathbf{u}, \mathbf{u} \in \mathcal{S}^{k-1}\}$, which
generate quantile contours: this point of view is developed in Section 4.

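Before turning to the empirical version of our quantiles, let us present an
alternative (but strictly equivalent) definition of our $\boldsymbol{\tau}$-quantiles, based on
a constrained optimization formulation.

Definition 2.2. The $\boldsymbol{\tau}$-quantile of $\mathbf{Z}$
($\boldsymbol{\tau} =: \tau\mathbf{u} \in \mathcal{B}^k$) is any element of
the collection $\Pi_{\boldsymbol{\tau}}$ of hyperplanes
$\pi_{\boldsymbol{\tau}} := \{\mathbf{z} \in \mathbb{R}^k : \mathbf{c}_{\boldsymbol{\tau}}'\mathbf{z} = a_{\boldsymbol{\tau}}\}$ such that

$$(a_{\boldsymbol{\tau}}, \mathbf{c}_{\boldsymbol{\tau}}')' \in \operatorname*{arg\,min}_{(a,\mathbf{c}')' \in \mathcal{M}_{\mathbf{u}}} \Psi^c_{\boldsymbol{\tau}}(a,\mathbf{c}), \tag{2.4}$$

where $\Psi^c_{\boldsymbol{\tau}}(a,\mathbf{c}) := \mathrm{E}[\rho_\tau(\mathbf{c}'\mathbf{Z} - a)]$ and
$\mathcal{M}_{\mathbf{u}} := \{(a,\mathbf{c}')' \in \mathbb{R}^{k+1} : \mathbf{u}'\mathbf{c} = 1\}$.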

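Clearly, if $(a_{\boldsymbol{\tau}}, \mathbf{b}_{\boldsymbol{\tau}}')'$ is a minimizer of (2.1), then
$(a_{\boldsymbol{\tau}}, \mathbf{c}_{\boldsymbol{\tau}}')' := (a_{\boldsymbol{\tau}}, (\mathbf{u} - \boldsymbol{\Gamma}_{\mathbf{u}}\mathbf{b}_{\boldsymbol{\tau}})')'$
minimizes the objective function in (2.4); conversely, for any minimizer
$(a_{\boldsymbol{\tau}}, \mathbf{c}_{\boldsymbol{\tau}}')'$ of (2.4),
$(a_{\boldsymbol{\tau}}, \mathbf{b}_{\boldsymbol{\tau}}')' := (a_{\boldsymbol{\tau}}, (-\boldsymbol{\Gamma}_{\mathbf{u}}'\mathbf{c}_{\boldsymbol{\tau}})')'$
minimizes the objective function in (2.1). The two definitions thus coincide; in particular, the lower
and upper quantile halfspaces $\{\mathbf{z} \in \mathbb{R}^k : \mathbf{c}_{\boldsymbol{\tau}}'\mathbf{z} < a_{\boldsymbol{\tau}}\}$ and
$\{\mathbf{z} \in \mathbb{R}^k : \mathbf{c}_{\boldsymbol{\tau}}'\mathbf{z} \geq a_{\boldsymbol{\tau}}\}$ associated
with the quantile hyperplanes of Definition 2.2 coincide with those
in (2.2) and (2.3), and therefore, depending on the context, the notation
$H^{\pm}_{\boldsymbol{\tau}}(a_{\boldsymbol{\tau}}, \mathbf{b}_{\boldsymbol{\tau}})$,
$H^{\pm}_{\boldsymbol{\tau}}(a_{\boldsymbol{\tau}}, \mathbf{c}_{\boldsymbol{\tau}})$, or simply
$H^{\pm}_{\boldsymbol{\tau}}$ will be used indifferently. Definitions
2.1 and 2.2 both have advantages and, in the sequel, we use them both.
Definition 2.1 is preferred in this section since it carries all the intuitive contents
of our concept; the advantages of Definition 2.2, of an analytical nature, will
appear more clearly in Sections 3.1 and 5.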
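
The correspondence between the two parametrizations can be sketched for $k = 2$, where $\boldsymbol{\Gamma}_{\mathbf{u}}$ has a single column and $\mathbf{b}$ is a scalar. The helper names, the angle and the coefficients below are hypothetical illustration values; the sketch only checks the algebra $\mathbf{c} = \mathbf{u} - \boldsymbol{\Gamma}_{\mathbf{u}}\mathbf{b}$ and $\mathbf{b} = -\boldsymbol{\Gamma}_{\mathbf{u}}'\mathbf{c}$, not any optimization.

```python
import math

def basis(theta):
    """Unit vector u and a unit vector gamma orthogonal to it (k = 2)."""
    u = (math.cos(theta), math.sin(theta))
    gamma = (-math.sin(theta), math.cos(theta))  # single column of Gamma_u
    return u, gamma

def dot(v, w):
    """Euclidean inner product in the plane."""
    return v[0] * w[0] + v[1] * w[1]

def b_to_c(b, u, gamma):
    """c = u - Gamma_u b (b is a scalar because k - 1 = 1)."""
    return (u[0] - gamma[0] * b, u[1] - gamma[1] * b)

def c_to_b(c, gamma):
    """b = -Gamma_u' c."""
    return -dot(gamma, c)

u, gamma = basis(0.7)
a, b = 0.25, -1.5   # arbitrary hyperplane coefficients
c = b_to_c(b, u, gamma)
z = (0.3, -0.8)     # arbitrary test point

res_21 = dot(u, z) - b * dot(gamma, z) - a  # residual as in Definition 2.1
res_22 = dot(c, z) - a                      # residual as in Definition 2.2
print(dot(u, c), c_to_b(c, gamma), res_21, res_22)
```

Since the two residuals agree at every point, the two objective functions take the same value, and the constraint $\mathbf{u}'\mathbf{c} = 1$ holds automatically for any $\mathbf{c}$ built this way.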

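The empirical versions of our quantile hyperplanes and the corresponding
lower and upper quantile halfspaces naturally follow as sample analogs of
the population concepts. To be more specific, let
$\mathbf{Z}^{(n)} := (\mathbf{Z}_1, \ldots, \mathbf{Z}_n)$ be an
$n$-tuple ($n > k$) of $k$-dimensional random vectors: we define the empirical
$\boldsymbol{\tau}$-quantile of $\mathbf{Z}^{(n)}$ as any element of the collection
$\Pi^{(n)}_{\boldsymbol{\tau}}$ of hyperplanes
$\pi^{(n)}_{\boldsymbol{\tau}} := \{\mathbf{z} \in \mathbb{R}^k : \mathbf{u}'\mathbf{z} = \mathbf{b}^{(n)\prime}_{\boldsymbol{\tau}}\boldsymbol{\Gamma}_{\mathbf{u}}'\mathbf{z} + a^{(n)}_{\boldsymbol{\tau}}\}$
such that (with obvious notation)

$$(a^{(n)}_{\boldsymbol{\tau}}, \mathbf{b}^{(n)\prime}_{\boldsymbol{\tau}})' \in \operatorname*{arg\,min}_{(a,\mathbf{b}')' \in \mathbb{R}^k} \Psi^{(n)}_{\boldsymbol{\tau}}(a,\mathbf{b}) \quad\text{with}\quad \Psi^{(n)}_{\boldsymbol{\tau}}(a,\mathbf{b}) := \frac{1}{n}\sum_{i=1}^n \rho_\tau(Z_{i\mathbf{u}} - \mathbf{b}'\mathbf{Z}^{\perp}_{i\mathbf{u}} - a), \tag{2.5}$$

or equivalently, of hyperplanes
$\pi^{(n)}_{\boldsymbol{\tau}} := \{\mathbf{z} \in \mathbb{R}^k : \mathbf{c}^{(n)\prime}_{\boldsymbol{\tau}}\mathbf{z} = a^{(n)}_{\boldsymbol{\tau}}\}$ such that

$$(a^{(n)}_{\boldsymbol{\tau}}, \mathbf{c}^{(n)\prime}_{\boldsymbol{\tau}})' \in \operatorname*{arg\,min}_{(a,\mathbf{c}')' \in \mathcal{M}_{\mathbf{u}}} \Psi^{c(n)}_{\boldsymbol{\tau}}(a,\mathbf{c}) \quad\text{with}\quad \Psi^{c(n)}_{\boldsymbol{\tau}}(a,\mathbf{c}) := \frac{1}{n}\sum_{i=1}^n \rho_\tau(\mathbf{c}'\mathbf{Z}_i - a) \tag{2.6}$$

(no moment assumption is required here). These empirical quantiles, which
for given $\mathbf{u}$ clearly coincide with the Koenker and Bassett [18] hyperplanes
in the coordinate system $(\mathbf{u} \,\vdots\, \boldsymbol{\Gamma}_{\mathbf{u}})$, allow for defining, in an obvious way,
the empirical analogs $H^{(n)-}_{\boldsymbol{\tau}}$ and $H^{(n)+}_{\boldsymbol{\tau}}$ of the lower and upper quantile
halfspaces in (2.2) and (2.3); see Figures 1 and 2 for an illustration.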

Of course, empirical distributions are inherently discrete, and empirical
$\boldsymbol{\tau}$-quantiles and halfspaces in general are not uniquely defined. However, the
minimizers of (2.5) [equivalently, of (2.6)], for given $\boldsymbol{\tau}$, are "close to each
other," in the sense that the set of minimizers is convex, hence connected
(this readily follows from the fact that the objective functions are convex);
this set is shrinking, as $n \to \infty$, to a single point which corresponds to the
uniquely defined population quantile, provided that the following assumption
is fulfilled (see the asymptotic results of Section 3.3 for details).

Assumption (A$_n$). The observations $\mathbf{Z}_i$, $i = 1, \ldots, n$ are i.i.d. with a
common distribution satisfying Assumption (A).



Finally, note that, since the empirical versions of our quantiles, for given
$\mathbf{u}$, are defined as standard single-output quantile regression hyperplanes,
they inherit the linear programming features of the Koenker-Bassett theory.
This certainly is one of the most important and attractive properties of the
proposed quantiles; see Section 5 for details.

Fig. 1. The left plot contains $n = 9$ (red) points drawn from $U([-0.5, 0.5]^2)$, the centered
bivariate uniform distribution over the unit square, and provides all $\boldsymbol{\tau}$-quantile hyperplanes
for $\tau = 0.2$. These hyperplanes define a polygonal central region (green contour) which, in
Section 4, is shown to coincide with a Tukey depth region. The quantile hyperplanes
contributing an edge to the polygonal central region are shown in magenta; those associated
with the four semiaxial directions in black; all other ones in blue. The right plot
provides the same information for $n = 499$ (invisible) points drawn from the same population
distribution.

3. Multivariate quantiles as directional quantiles. In this section, we
describe the "directional" properties of our quantiles. We first derive and
discuss the subgradient conditions associated with the optimization problems
in Definitions 2.1 and 2.2, then state the strong equivariance properties of
our empirical quantiles, and finally present some asymptotic results.

Fig. 2. This plot provides six $\boldsymbol{\tau}$-quantile hyperplanes (in black) in the semiaxial directions
for $\tau = 0.1$, computed from $n = 49$ (red) points drawn from $U([-0.5, 0.5]^3)$, the centered
trivariate uniform distribution over the unit cube, along with the corresponding central
(Tukey depth) region (in green).

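3.1. Subgradient conditions. Under Assumption (A), the objective function
appearing in Definition 2.1 is convex and continuously differentiable
on $\mathbb{R}^k$. Therefore, our population $\boldsymbol{\tau}$-quantiles can be equivalently defined as
the collection of hyperplanes associated with the solutions
$(a_{\boldsymbol{\tau}}, \mathbf{b}_{\boldsymbol{\tau}}')'$ of the system of equations

$$\operatorname{grad}_{(a,\mathbf{b}')'} \Psi_{\boldsymbol{\tau}}(a,\mathbf{b}) = \mathbf{0} \tag{3.1}$$

(see Sections 2.2.1 and 2.2.2 in [16]). These hyperplanes thus are characterized
by the relations

$$0 = (\partial_a \Psi_{\boldsymbol{\tau}}(a,\mathbf{b}))_{(a_{\boldsymbol{\tau}}, \mathbf{b}_{\boldsymbol{\tau}}')'} = \mathrm{P}[\mathbf{u}'\mathbf{Z} < \mathbf{b}_{\boldsymbol{\tau}}'\boldsymbol{\Gamma}_{\mathbf{u}}'\mathbf{Z} + a_{\boldsymbol{\tau}}] - \tau, \tag{3.2a}$$

$$\mathbf{0} = (\operatorname{grad}_{\mathbf{b}} \Psi_{\boldsymbol{\tau}}(a,\mathbf{b}))_{(a_{\boldsymbol{\tau}}, \mathbf{b}_{\boldsymbol{\tau}}')'} = -\tau\,\mathrm{E}[\boldsymbol{\Gamma}_{\mathbf{u}}'\mathbf{Z}] + \mathrm{E}[\boldsymbol{\Gamma}_{\mathbf{u}}'\mathbf{Z}\, I_{[\mathbf{u}'\mathbf{Z} < \mathbf{b}_{\boldsymbol{\tau}}'\boldsymbol{\Gamma}_{\mathbf{u}}'\mathbf{Z} + a_{\boldsymbol{\tau}}]}]. \tag{3.2b}$$

Clearly, relation (3.2a) provides our multivariate $\boldsymbol{\tau}$-quantiles with a natural
probabilistic interpretation, as it keeps the probability of their lower
halfspaces equal to $\tau$ ($= \|\boldsymbol{\tau}\|$). As for relation (3.2b), it can be rewritten as

$$\boldsymbol{\Gamma}_{\mathbf{u}}' \left[ \frac{1}{1-\tau}\,\mathrm{E}[\mathbf{Z}\, I_{[\mathbf{Z} \in H^+_{\boldsymbol{\tau}}]}] - \frac{1}{\tau}\,\mathrm{E}[\mathbf{Z}\, I_{[\mathbf{Z} \in H^-_{\boldsymbol{\tau}}]}] \right] = \mathbf{0}, \tag{3.3}$$

which, combined with (3.2a), shows that the straight line through the
probability mass centers $\frac{1}{\tau}\mathrm{E}[\mathbf{Z}\, I_{[\mathbf{Z} \in H^-_{\boldsymbol{\tau}}]}]$ and
$\frac{1}{1-\tau}\mathrm{E}[\mathbf{Z}\, I_{[\mathbf{Z} \in H^+_{\boldsymbol{\tau}}]}]$ of the lower and
upper $\boldsymbol{\tau}$-quantile halfspaces is parallel to $\mathbf{u}$ ($:= \boldsymbol{\tau}/\tau$). Note moreover that,
quite trivially,

$$\mathrm{E}[\mathbf{Z}] = \tau \left( \frac{1}{\tau}\,\mathrm{E}[\mathbf{Z}\, I_{[\mathbf{Z} \in H^-_{\boldsymbol{\tau}}]}] \right) + (1-\tau) \left( \frac{1}{1-\tau}\,\mathrm{E}[\mathbf{Z}\, I_{[\mathbf{Z} \in H^+_{\boldsymbol{\tau}}]}] \right),$$

so that the overall probability mass center also belongs to the same straight
line.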

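Now consider the gradient conditions associated with Definition 2.2, which
state that $(a_{\boldsymbol{\tau}}, \mathbf{c}_{\boldsymbol{\tau}}', \lambda_{\boldsymbol{\tau}})'$ are solutions of the system

$$\operatorname{grad}_{(a,\mathbf{c}',\lambda)'} L_{\boldsymbol{\tau}}(a,\mathbf{c},\lambda) = \mathbf{0} \quad\text{with}\quad L_{\boldsymbol{\tau}}(a,\mathbf{c},\lambda) := \Psi^c_{\boldsymbol{\tau}}(a,\mathbf{c}) - \lambda(\mathbf{u}'\mathbf{c} - 1) \tag{3.4}$$

(the Lagrangian function of the problem). Equivalently [indeed, the only
points in $\mathbb{R}^{k+2}$ where $(a,\mathbf{c}',\lambda)' \mapsto L_{\boldsymbol{\tau}}(a,\mathbf{c},\lambda)$ is not continuously
differentiable are of the form $(0,\mathbf{0}',\lambda)'$, hence cannot be associated with a minimum
of (2.4)], the latter gradient conditions can be rewritten as

$$0 = (\partial_a L_{\boldsymbol{\tau}}(a,\mathbf{c},\lambda))_{(a_{\boldsymbol{\tau}}, \mathbf{c}_{\boldsymbol{\tau}}', \lambda_{\boldsymbol{\tau}})'} = \mathrm{P}[\mathbf{c}_{\boldsymbol{\tau}}'\mathbf{Z} < a_{\boldsymbol{\tau}}] - \tau = \mathrm{P}[\mathbf{Z} \in H^-_{\boldsymbol{\tau}}(a_{\boldsymbol{\tau}}, \mathbf{c}_{\boldsymbol{\tau}})] - \tau, \tag{3.5a}$$

$$\mathbf{0} = (\operatorname{grad}_{\mathbf{c}} L_{\boldsymbol{\tau}}(a,\mathbf{c},\lambda))_{(a_{\boldsymbol{\tau}}, \mathbf{c}_{\boldsymbol{\tau}}', \lambda_{\boldsymbol{\tau}})'} = \tau\,\mathrm{E}[\mathbf{Z}] - \mathrm{E}[\mathbf{Z}\, I_{[\mathbf{Z} \in H^-_{\boldsymbol{\tau}}(a_{\boldsymbol{\tau}}, \mathbf{c}_{\boldsymbol{\tau}})]}] - \lambda_{\boldsymbol{\tau}}\mathbf{u}, \tag{3.5b}$$

$$0 = (\partial_\lambda L_{\boldsymbol{\tau}}(a,\mathbf{c},\lambda))_{(a_{\boldsymbol{\tau}}, \mathbf{c}_{\boldsymbol{\tau}}', \lambda_{\boldsymbol{\tau}})'} = 1 - \mathbf{u}'\mathbf{c}_{\boldsymbol{\tau}}. \tag{3.5c}$$

For such a constrained optimization problem, gradient conditions in general
are necessary but not sufficient. In this case, however, note that premultiplying
both sides of (3.5b) by $\boldsymbol{\Gamma}_{\mathbf{u}}'$ yields (3.2b), which clearly implies that,
disregarding the Lagrange multiplier $\lambda_{\boldsymbol{\tau}}$ and (3.5c) to focus on (the
coefficients of) the quantile hyperplane $\pi_{\boldsymbol{\tau}}$, the necessary conditions (3.5a) and
(3.5b) are no weaker than the necessary and sufficient ones in (3.2a) and
(3.2b), hence are necessary and sufficient, too.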

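The gradient conditions (3.4) associated with Definition 2.2 are, in a sense,
richer than those (3.1) associated with the original definition of our quantiles,
which is actually one of the main reasons why we also consider that
alternative definition. Indeed, (3.5b), which can be rewritten as

$$\frac{1}{1-\tau}\,\mathrm{E}[\mathbf{Z}\, I_{[\mathbf{Z} \in H^+_{\boldsymbol{\tau}}]}] - \frac{1}{\tau}\,\mathrm{E}[\mathbf{Z}\, I_{[\mathbf{Z} \in H^-_{\boldsymbol{\tau}}]}] = \frac{\lambda_{\boldsymbol{\tau}}}{\tau(1-\tau)}\,\mathbf{u}, \tag{3.6}$$

is more informative than (3.2b)-(3.3), and clarifies the role of the Lagrange
multiplier $\lambda_{\boldsymbol{\tau}}$. Such a multiplier, which in general only measures the impact
of the boundary constraint [in this case, the constraint (3.5c)], here appears
as a functional that is potentially useful for testing (central, elliptical, or
spherical) symmetry or for measuring directional outlyingness and tail behavior
of the distribution; see Section 8. Moreover, premultiplying (3.5b)
with $\mathbf{c}_{\boldsymbol{\tau}}'$ yields
$\lambda_{\boldsymbol{\tau}}(\mathbf{c}_{\boldsymbol{\tau}}'\mathbf{u}) = \mathrm{E}[(\tau - I_{[\mathbf{c}_{\boldsymbol{\tau}}'\mathbf{Z} - a_{\boldsymbol{\tau}} < 0]})\,\mathbf{c}_{\boldsymbol{\tau}}'\mathbf{Z}]$,
that is, by using (3.5a) and (3.5c),

$$\lambda_{\boldsymbol{\tau}} = \Psi^c_{\boldsymbol{\tau}}(a_{\boldsymbol{\tau}}, \mathbf{c}_{\boldsymbol{\tau}}), \tag{3.7}$$

so that $\lambda_{\boldsymbol{\tau}}$ is nothing but the minimum achieved in (2.4) [equivalently, in
(2.1)].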
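
As an illustration of the role of the Lagrange multiplier, the following worked example (a sketch under assumptions chosen here for convenience, not taken from the paper) considers $\mathbf{Z}$ uniform over the unit square $[-1/2, 1/2]^2$, the direction $\mathbf{u} = (1,0)'$ and $\boldsymbol{\Gamma}_{\mathbf{u}} = (0,1)'$. By symmetry and independence of the coordinates, the gradient conditions of Definition 2.1 are satisfied by a vertical line, and the relation between mass centers and multiplier can be verified by hand.

```latex
% Assumed setting: Z ~ U([-1/2, 1/2]^2), u = (1,0)', Gamma_u = (0,1)'.
\begin{align*}
  &\text{Quantile hyperplane:}
    && b_{\boldsymbol{\tau}} = 0, \quad a_{\boldsymbol{\tau}} = \tau - \tfrac12,
       \quad \mathbf{c}_{\boldsymbol{\tau}} = \mathbf{u},
       \quad \pi_{\boldsymbol{\tau}} = \{\mathbf{z} : z_1 = \tau - \tfrac12\}, \\
  &\text{Lower mass center:}
    && \tfrac{1}{\tau}\,\mathrm{E}\bigl[\mathbf{Z}\, I_{[Z_1 < \tau - 1/2]}\bigr]
       = \bigl(\tfrac{\tau - 1}{2},\, 0\bigr)', \\
  &\text{Upper mass center:}
    && \tfrac{1}{1-\tau}\,\mathrm{E}\bigl[\mathbf{Z}\, I_{[Z_1 \geq \tau - 1/2]}\bigr]
       = \bigl(\tfrac{\tau}{2},\, 0\bigr)', \\
  &\text{Difference:}
    && \bigl(\tfrac{\tau}{2},\, 0\bigr)' - \bigl(\tfrac{\tau - 1}{2},\, 0\bigr)'
       = \tfrac12\,\mathbf{u}, \\
  &\text{Multiplier:}
    && \lambda_{\boldsymbol{\tau}}
       = \mathrm{E}\bigl[\rho_\tau\bigl(Z_1 - \tau + \tfrac12\bigr)\bigr]
       = \tau \int_{\tau}^{1} (w - \tau)\,dw + (1-\tau)\int_{0}^{\tau} (\tau - w)\,dw
       = \frac{\tau(1-\tau)}{2}.
\end{align*}
```

The difference of the two mass centers is therefore $\frac{\lambda_{\boldsymbol{\tau}}}{\tau(1-\tau)}\mathbf{u} = \frac12\mathbf{u}$, in agreement with the rewritten form of (3.5b), and $\lambda_{\boldsymbol{\tau}}$ equals the minimal value of the objective function, as in (3.7).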

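The sample objective functions $\Psi^{(n)}_{\boldsymbol{\tau}}(a,\mathbf{b})$ and
$\Psi^{c(n)}_{\boldsymbol{\tau}}(a,\mathbf{c})$ in (2.5) and
(2.6) are not continuously differentiable. They however have directional
derivatives in all directions, which can be used to formulate fixed-$\mathbf{u}$
subgradient conditions for the empirical $\boldsymbol{\tau}$-quantiles,
$\boldsymbol{\tau} = \tau\mathbf{u}$. Focusing first
on the constrained optimization problem (2.6), it is easy to show that the
coefficients $(a^{(n)}_{\boldsymbol{\tau}}, \mathbf{c}^{(n)\prime}_{\boldsymbol{\tau}})'$ and the corresponding
Lagrange multiplier $\lambda^{(n)}_{\boldsymbol{\tau}}$ of
any empirical $\boldsymbol{\tau}$-quantile
$\pi^{(n)}_{\boldsymbol{\tau}} = \{\mathbf{z} \in \mathbb{R}^k : \mathbf{c}^{(n)\prime}_{\boldsymbol{\tau}}\mathbf{z} = a^{(n)}_{\boldsymbol{\tau}}\}$
must satisfy (letting
$r^{(n)}_{i\boldsymbol{\tau}} := \mathbf{c}^{(n)\prime}_{\boldsymbol{\tau}}\mathbf{Z}_i - a^{(n)}_{\boldsymbol{\tau}}$, $i = 1, \ldots, n$)

$$\frac{1}{n}\sum_{i=1}^n I_{[r^{(n)}_{i\boldsymbol{\tau}} < 0]} \leq \tau \leq \frac{1}{n}\sum_{i=1}^n I_{[r^{(n)}_{i\boldsymbol{\tau}} \leq 0]}, \tag{3.8a}$$

$$-\frac{1}{n}\sum_{i=1}^n \mathbf{Z}_i^- I_{[r^{(n)}_{i\boldsymbol{\tau}} = 0]} \leq \tau\,\frac{1}{n}\sum_{i=1}^n \mathbf{Z}_i - \frac{1}{n}\sum_{i=1}^n \mathbf{Z}_i I_{[r^{(n)}_{i\boldsymbol{\tau}} < 0]} - \lambda^{(n)}_{\boldsymbol{\tau}}\mathbf{u} \leq \frac{1}{n}\sum_{i=1}^n \mathbf{Z}_i^+ I_{[r^{(n)}_{i\boldsymbol{\tau}} = 0]}, \tag{3.8b}$$

$$0 = 1 - \mathbf{u}'\mathbf{c}^{(n)}_{\boldsymbol{\tau}}, \tag{3.8c}$$

where $\mathbf{z}^+ := (\max(z_1,0), \ldots, \max(z_k,0))'$ and
$\mathbf{z}^- := (-\min(z_1,0), \ldots, -\min(z_k,0))'$, the inequalities in (3.8b) being
understood componentwise. These necessary conditions are obtained by imposing that,
at $(a^{(n)}_{\boldsymbol{\tau}}, \mathbf{c}^{(n)\prime}_{\boldsymbol{\tau}}, \lambda^{(n)}_{\boldsymbol{\tau}})'$,
directional derivatives in each of the $2(k+2)$ semi-axial
directions of the $(a,\mathbf{c}',\lambda)'$-space be nonnegative for $(a,\mathbf{c}')'$ and zero for $\lambda$.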

For $n \gg k$, we clearly may interpret (3.8a) and (3.8b) as an approximate
version of their population analogs (3.5a) and (3.5b), roughly with the same
consequences [the condition (3.8c) simply restates our boundary constraint].
More specifically, (3.8a) indicates that

$$\frac{N}{n} \leq \tau \leq \frac{N+Z}{n}, \quad\text{hence}\quad \frac{P}{n} \leq 1-\tau \leq \frac{P+Z}{n}, \tag{3.9}$$

where $N$, $P$ and $Z$ are the numbers of negative, positive and zero values,
respectively, in the residual series $r^{(n)}_{i\boldsymbol{\tau}}$, $i = 1, \ldots, n$. This implies that, for
noninteger values of $n\tau$, empirical $\boldsymbol{\tau}$-quantile hyperplanes have to go through
some of the $\mathbf{Z}_i$'s. Actually, if the data points are in general position [which
of course holds with probability one under Assumption (A$_n$)], there exists
a sample $\boldsymbol{\tau}$-quantile hyperplane $\pi^{(n)}_{\boldsymbol{\tau}}$ which fits exactly $k$ observations; (3.9)
then holds with $Z = k$ (see Sections 2.2.1 and 2.2.2 of [16]). Note that the
inequalities in (3.8a) and (3.8b) [hence, also in (3.9)] must be strict if the
sample $\boldsymbol{\tau}$-quantile is to be uniquely defined. Finally, as we will see in (5.2)
below, the value of $\lambda^{(n)}_{\boldsymbol{\tau}}$, as in the population case, is the minimal one
that can be achieved in (2.6), hence also in (2.5).
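
The counting inequalities in (3.9) can be checked on simulated data. The sketch below, for $k = 2$, relies on the fact recalled above that some minimizer of (2.5) fits $k$ observations, and simply enumerates all lines through two data points. This brute-force search is an illustrative substitute for the linear programming algorithms of Section 5, not the method of the paper; the function names, the seed, the sample size and the direction are hypothetical choices.

```python
import itertools
import math
import random

def check_loss(x, tau):
    """Check function rho_tau(x) = x * (tau - 1[x < 0])."""
    return x * (tau - (1.0 if x < 0 else 0.0))

def empirical_quantile_2d(points, tau, theta):
    """Brute-force minimizer of (2.5) for k = 2 and u = (cos theta, sin theta).

    Only lines through two observations are tried; this is an illustrative
    O(n^3) enumeration, not a linear programming solver.
    """
    u = (math.cos(theta), math.sin(theta))
    g = (-math.sin(theta), math.cos(theta))          # column of Gamma_u
    y = [u[0] * p[0] + u[1] * p[1] for p in points]  # u'Z_i
    x = [g[0] * p[0] + g[1] * p[1] for p in points]  # Gamma_u'Z_i
    best = None
    for i, j in itertools.combinations(range(len(points)), 2):
        if x[i] == x[j]:
            continue  # line parallel to u: not of the form u'z = b * Gamma_u'z + a
        b = (y[i] - y[j]) / (x[i] - x[j])
        a = y[i] - b * x[i]
        value = sum(check_loss(yl - b * xl - a, tau) for yl, xl in zip(y, x))
        if best is None or value < best[0]:
            best = (value, a, b)
    value, a, b = best
    residuals = [yl - b * xl - a for yl, xl in zip(y, x)]
    return a, b, residuals

random.seed(0)
n, tau = 25, 0.3
points = [(random.uniform(-0.5, 0.5), random.uniform(-0.5, 0.5)) for _ in range(n)]
a, b, residuals = empirical_quantile_2d(points, tau, theta=0.4)

tol = 1e-9  # residuals below tol in absolute value are treated as zero
N = sum(r < -tol for r in residuals)
Z = sum(abs(r) <= tol for r in residuals)
P = sum(r > tol for r in residuals)
print(N / n, tau, (N + Z) / n)  # (3.9): N/n <= tau <= (N + Z)/n
```

Here $n\tau = 7.5$ is not an integer, so the selected line has to pass through data points; for points in general position one expects $Z = k = 2$.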

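For the unconstrained definition of our empirical quantiles in (2.5),
necessary and sufficient subgradient conditions can be obtained by applying
Theorem 2.1 of [16], since (2.5) is nothing but a standard single-output
quantile regression optimization problem. Assuming that the data points are
in general position and defining, for any $k$-tuple of indices $h = (i_1, \ldots, i_k)$,
$1 \leq i_1 < \cdots < i_k \leq n$,

$$\mathbf{Y}_{\mathbf{u}}(h) := \mathbf{Z}'(h)\mathbf{u} \quad\text{and}\quad \mathbf{X}_{\mathbf{u}}(h) := (\mathbf{1}_k \,\vdots\, \mathbf{Z}'(h)\boldsymbol{\Gamma}_{\mathbf{u}}), \tag{3.10}$$

where $\mathbf{Z}(h) := (\mathbf{Z}_{i_1}, \ldots, \mathbf{Z}_{i_k})$ and
$\mathbf{1}_k = (1, \ldots, 1)' \in \mathbb{R}^k$, Koenker's result, in
the present context, states that
$(a^{(n)}_{\boldsymbol{\tau}}, \mathbf{b}^{(n)\prime}_{\boldsymbol{\tau}})' = (\mathbf{X}_{\mathbf{u}}(h))^{-1}\mathbf{Y}_{\mathbf{u}}(h)$ (we just
pointed out that, under such conditions, there always exists a quantile
hyperplane fitting exactly $k$ observations) is a solution of (2.5) if and only
if

$$-\tau\mathbf{1}_k \leq \boldsymbol{\xi}_{\boldsymbol{\tau}}(h) \leq (1-\tau)\mathbf{1}_k, \tag{3.11}$$

where

$$\boldsymbol{\xi}_{\boldsymbol{\tau}}(h) := (\mathbf{X}_{\mathbf{u}}'(h))^{-1} \sum_{i \notin h} (\tau - I_{[r^{(n)}_{i\boldsymbol{\tau}} < 0]}) \begin{pmatrix} 1 \\ \boldsymbol{\Gamma}_{\mathbf{u}}'\mathbf{Z}_i \end{pmatrix} \tag{3.12}$$

with $r^{(n)}_{i\boldsymbol{\tau}} := \mathbf{u}'\mathbf{Z}_i - \mathbf{b}^{(n)\prime}_{\boldsymbol{\tau}}\boldsymbol{\Gamma}_{\mathbf{u}}'\mathbf{Z}_i - a^{(n)}_{\boldsymbol{\tau}}$.
Again, this solution is unique if and only if
the inequalities in (3.11) are strict; see [16]. As for the constrained case, it
follows from the linear programming theory that
$(a^{(n)}_{\boldsymbol{\tau}}, \mathbf{c}^{(n)\prime}_{\boldsymbol{\tau}})'$ are the coefficients
of a $\boldsymbol{\tau}$-quantile hyperplane if and only if (3.11) holds with
$r^{(n)}_{i\boldsymbol{\tau}} := \mathbf{c}^{(n)\prime}_{\boldsymbol{\tau}}\mathbf{Z}_i - a^{(n)}_{\boldsymbol{\tau}}$
in (3.12) (still with a unique solution when the inequalities are strict).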

We stress that no conditions (in particular, no moment conditions) are 
required here; only, the data points are assumed to be in general position. 

3.2. Equivariance properties. For the sake of simplicity, results for
population quantiles here are stated under Assumption (A); more general
statements could be derived, however, by taking into account the possible
nonuniqueness of the resulting $\boldsymbol{\tau}$-quantiles (see Proposition 2.1). It is then easy to check
that, with obvious notation, the affine-equivariance property

$$\pi_{\tau\mathbf{M}\mathbf{u}/\|\mathbf{M}\mathbf{u}\|}(\mathbf{M}\mathbf{Z} + \mathbf{d}) = \mathbf{M}\pi_{\tau\mathbf{u}}(\mathbf{Z}) + \mathbf{d} \tag{3.13}$$

holds for any invertible $k \times k$ matrix $\mathbf{M}$ and any vector $\mathbf{d} \in \mathbb{R}^k$. Since,
moreover, $\|\tau\mathbf{M}\mathbf{u}\|/\|\mathbf{M}\mathbf{u}\| = \|\tau\mathbf{u}\|$, (3.13) is also compatible with the general
equivariance property advocated by [34] in his Definition 2.1. In particular,
for translations, we have $\pi_{\tau\mathbf{u}}(\mathbf{Z} + \mathbf{d}) = \pi_{\tau\mathbf{u}}(\mathbf{Z}) + \mathbf{d}$ for any $k$-vector $\mathbf{d}$, which
confirms that our concept of multivariate quantiles is not localized at any
point of the $k$-dimensional Euclidean space; this was not so clear in Section
2 since the center of the unit sphere $\mathcal{S}^{k-1}$ (the origin of $\mathbb{R}^k$) seems to play
an important role in their definitions. This is in sharp contrast with other
directional quantile contours that are defined with respect to some location
center, such as those of [20] (under the terminology quantile biplots) and
[37].
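Note that for any $\tau \in (0,1)$ and any $\mathbf{u} \in \mathcal{S}^{k-1}$,

$$\pi_{(1-\tau)\mathbf{u}}(\mathbf{Z}) = \pi_{\tau(-\mathbf{u})}(\mathbf{Z}) \tag{3.14}$$

with the corresponding upper and lower halfspaces exchanged:
$\operatorname{int} H^{\pm}_{(1-\tau)\mathbf{u}}(\mathbf{Z}) = \operatorname{int} H^{\mp}_{\tau(-\mathbf{u})}(\mathbf{Z})$.
Clearly, there is no general link between $\pi_{\tau(-\mathbf{u})}(\mathbf{Z})$ and $\pi_{\tau\mathbf{u}}(\mathbf{Z})$
unless the distribution of $\mathbf{Z}$ is centrally symmetric with respect to some point.
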
3.3. Asymptotic results. This section derives, under Assumption (A$_n$)
above, strong consistency, asymptotic normality and Bahadur-type
representation results for sample $\boldsymbol{\tau}$-quantiles and related quantities.

Under Assumption (A), the population $\boldsymbol{\tau}$-quantiles
$(a_{\boldsymbol{\tau}}, \mathbf{b}_{\boldsymbol{\tau}}')'$ and $(a_{\boldsymbol{\tau}}, \mathbf{c}_{\boldsymbol{\tau}}')'$
always are uniquely defined (Proposition 2.1), unlike their sample counterparts
$(a^{(n)}_{\boldsymbol{\tau}}, \mathbf{b}^{(n)\prime}_{\boldsymbol{\tau}})'$ and
$(a^{(n)}_{\boldsymbol{\tau}}, \mathbf{c}^{(n)\prime}_{\boldsymbol{\tau}})'$; in the sequel, the latter notation will be
used for arbitrary sequences of solutions to (2.5) and (2.6), respectively.

Strong consistency of our sample $\boldsymbol{\tau}$-quantiles, namely the fact that
$(a^{(n)}_{\boldsymbol{\tau}}, \mathbf{b}^{(n)\prime}_{\boldsymbol{\tau}})'$ converges to
$(a_{\boldsymbol{\tau}}, \mathbf{b}_{\boldsymbol{\tau}}')'$ almost surely as $n \to \infty$, holds under
Assumption (A$_n$); this follows, for example, from [13], Section 2.3.
Asymptotic normality and Bahadur-type representation results, however, require
slightly stronger assumptions. Consider the following reinforcement of
Assumption (A$_n$).

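Assumption (A$'_n$). The observations $\mathbf{Z}_i$, $i = 1, \ldots, n$ are i.i.d. with a
common distribution that is absolutely continuous with respect to the Lebesgue
measure on $\mathbb{R}^k$, with a density ($f$, say) that has a connected support, admits
finite second-order moments and, for some constants $C > 0$, $r > k - 2$ and
$s > 0$, satisfies

$$|f(\mathbf{z}_1) - f(\mathbf{z}_2)| \leq C\,\|\mathbf{z}_1 - \mathbf{z}_2\|^s \left(1 + \left\|\frac{\mathbf{z}_1 + \mathbf{z}_2}{2}\right\|^2\right)^{-(3+r+s)/2} \tag{3.15}$$

for all $\mathbf{z}_1, \mathbf{z}_2 \in \mathbb{R}^k$.

Condition (3.15) is very mild. In particular, for $s = 1$, it is satisfied by any
continuously differentiable density $f$ for which there exist some constants
$C > 0$, $r > k - 2$ and some invertible $k \times k$ matrix $\mathbf{M}$ such that

$$\sup_{\|\mathbf{M}\mathbf{z}\| \geq R} \|\nabla f(\mathbf{z})\| \leq C(1 + R^2)^{-(r+4)/2}$$

for all $R > 0$. Hence, Assumption (A$'_n$) holds, for example, when the $\mathbf{Z}_i$'s are
i.i.d. multinormal or elliptical $t$ with $\nu > 2$ degrees of freedom. Differentiability
however is not required, and (3.15) also holds, for instance, for elliptical
densities proportional to $\exp(-\|\mathbf{M}\mathbf{z}\|)$ (which are not differentiable at the
origin).

As we show in the Appendix (see the proof of Theorem 3.1), Assumption
(A$'_n$) implies that the (strictly convex) function
$(a,\mathbf{b}')' \mapsto \Psi_{\boldsymbol{\tau}}(a,\mathbf{b})$ (see
Definition 2.1) is twice differentiable at $(a_{\boldsymbol{\tau}}, \mathbf{b}_{\boldsymbol{\tau}}')'$, with Hessian matrix

$$\mathbf{H}_{\boldsymbol{\tau}} := \int_{\mathbb{R}^{k-1}} \begin{pmatrix} 1 & \mathbf{x}' \\ \mathbf{x} & \mathbf{x}\mathbf{x}' \end{pmatrix} f((a_{\boldsymbol{\tau}} + \mathbf{b}_{\boldsymbol{\tau}}'\mathbf{x})\mathbf{u} + \boldsymbol{\Gamma}_{\mathbf{u}}\mathbf{x})\,d\mathbf{x}
= \mathbf{J}_{\mathbf{u}}' \left( \int_{\mathbf{u}^{\perp}} \begin{pmatrix} 1 & \mathbf{z}' \\ \mathbf{z} & \mathbf{z}\mathbf{z}' \end{pmatrix} f((a_{\boldsymbol{\tau}} - \mathbf{c}_{\boldsymbol{\tau}}'\mathbf{z})\mathbf{u} + \mathbf{z})\,d\sigma(\mathbf{z}) \right) \mathbf{J}_{\mathbf{u}} =: \mathbf{J}_{\mathbf{u}}'\mathbf{H}^c_{\boldsymbol{\tau}}\mathbf{J}_{\mathbf{u}},$$

where $\mathbf{u}^{\perp} := \{\mathbf{z} \in \mathbb{R}^k : \mathbf{u}'\mathbf{z} = 0\}$ and $\mathbf{J}_{\mathbf{u}}$ denotes the
$(k+1) \times k$ block-diagonal
matrix with diagonal blocks 1 and $\boldsymbol{\Gamma}_{\mathbf{u}}$. Strict convexity implies that
$\mathbf{H}_{\boldsymbol{\tau}}$ is
positive semidefinite. Since, however, for all $\boldsymbol{\tau}$ and
$\mathbf{w} := (w_0, \mathbf{v}')' \neq \mathbf{0}$,

$$\mathbf{w}'\mathbf{H}_{\boldsymbol{\tau}}\mathbf{w} = \int_{\mathbb{R}^{k-1}} (w_0 + \mathbf{v}'\mathbf{x})^2 f((a_{\boldsymbol{\tau}} + \mathbf{b}_{\boldsymbol{\tau}}'\mathbf{x})\mathbf{u} + \boldsymbol{\Gamma}_{\mathbf{u}}\mathbf{x})\,d\mathbf{x},$$

$\mathbf{H}_{\boldsymbol{\tau}}$, under Assumption (A$'_n$), is positive definite for all $\boldsymbol{\tau}$.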

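Letting $\boldsymbol{\xi}_{i,\boldsymbol{\tau}}(a,\mathbf{b}) := -(\tau - I_{[\mathbf{u}'\mathbf{Z}_i - \mathbf{b}'\boldsymbol{\Gamma}_{\mathbf{u}}'\mathbf{Z}_i - a < 0]})\tilde{\mathbf{Z}}_i$ and
$\boldsymbol{\xi}^c_{i,\boldsymbol{\tau}}(a,\mathbf{c}) := -(\tau - I_{[\mathbf{c}'\mathbf{Z}_i - a < 0]})\tilde{\mathbf{Z}}_i$, where
$\tilde{\mathbf{Z}}_i := (1, \mathbf{Z}_i')'$, we have

$$\mathbf{V}_{\boldsymbol{\tau}} := \operatorname{Var}[\mathbf{J}_{\mathbf{u}}'\boldsymbol{\xi}_{i,\boldsymbol{\tau}}(a_{\boldsymbol{\tau}}, \mathbf{b}_{\boldsymbol{\tau}})]
= \mathbf{J}_{\mathbf{u}}' \begin{pmatrix} \tau(1-\tau) & \tau(1-\tau)\mathrm{E}[\mathbf{Z}'] \\ \tau(1-\tau)\mathrm{E}[\mathbf{Z}] & \operatorname{Var}[(\tau - I_{[\mathbf{Z} \in H^-_{\boldsymbol{\tau}}]})\mathbf{Z}] \end{pmatrix} \mathbf{J}_{\mathbf{u}}
= \mathbf{J}_{\mathbf{u}}' \operatorname{Var}[\boldsymbol{\xi}^c_{i,\boldsymbol{\tau}}(a_{\boldsymbol{\tau}}, \mathbf{c}_{\boldsymbol{\tau}})]\,\mathbf{J}_{\mathbf{u}} =: \mathbf{J}_{\mathbf{u}}'\mathbf{V}^c_{\boldsymbol{\tau}}\mathbf{J}_{\mathbf{u}}.$$

We are then ready to state an asymptotic normality and Bahadur-type
representation result for our sample $\boldsymbol{\tau}$-quantile coefficients, which is the main
result of this section.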

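Theorem 3.1. Let Assumption (A$'_n$) hold. Then,

$$\sqrt{n} \begin{pmatrix} a^{(n)}_{\boldsymbol{\tau}} - a_{\boldsymbol{\tau}} \\ \mathbf{b}^{(n)}_{\boldsymbol{\tau}} - \mathbf{b}_{\boldsymbol{\tau}} \end{pmatrix} = -\frac{1}{\sqrt{n}}\,\mathbf{H}_{\boldsymbol{\tau}}^{-1} \sum_{i=1}^n \mathbf{J}_{\mathbf{u}}'\boldsymbol{\xi}_{i,\boldsymbol{\tau}}(a_{\boldsymbol{\tau}}, \mathbf{b}_{\boldsymbol{\tau}}) + o_{\mathrm{P}}(1) \tag{3.16}$$

$$\stackrel{\mathcal{L}}{\longrightarrow} \mathcal{N}_k(\mathbf{0}, \mathbf{H}_{\boldsymbol{\tau}}^{-1}\mathbf{V}_{\boldsymbol{\tau}}\mathbf{H}_{\boldsymbol{\tau}}^{-1}) \quad\text{as } n \to \infty. \tag{3.17}$$

Equivalently, writing $\mathbf{P}_k$ for the $(k+1) \times (k+1)$ diagonal matrix with
diagonal $(1, -1, \ldots, -1)$,

$$\sqrt{n} \begin{pmatrix} a^{(n)}_{\boldsymbol{\tau}} - a_{\boldsymbol{\tau}} \\ \mathbf{c}^{(n)}_{\boldsymbol{\tau}} - \mathbf{c}_{\boldsymbol{\tau}} \end{pmatrix} = -\frac{1}{\sqrt{n}}\,\mathbf{P}_k(\mathbf{H}^c_{\boldsymbol{\tau}})^{-} \sum_{i=1}^n \boldsymbol{\xi}^c_{i,\boldsymbol{\tau}}(a_{\boldsymbol{\tau}}, \mathbf{c}_{\boldsymbol{\tau}}) + o_{\mathrm{P}}(1) \tag{3.18}$$

$$\stackrel{\mathcal{L}}{\longrightarrow} \mathcal{N}_{k+1}(\mathbf{0}, \mathbf{P}_k(\mathbf{H}^c_{\boldsymbol{\tau}})^{-}\mathbf{V}^c_{\boldsymbol{\tau}}(\mathbf{H}^c_{\boldsymbol{\tau}})^{-}\mathbf{P}_k'), \tag{3.19}$$

where $(\mathbf{H}^c_{\boldsymbol{\tau}})^{-}$ denotes the Moore-Penrose pseudoinverse of
$\mathbf{H}^c_{\boldsymbol{\tau}}$. Moreover,

$$\sqrt{n}(\lambda^{(n)}_{\boldsymbol{\tau}} - \lambda_{\boldsymbol{\tau}}) = \frac{1}{\sqrt{n}} \sum_{i=1}^n (\rho_\tau(\mathbf{c}_{\boldsymbol{\tau}}'\mathbf{Z}_i - a_{\boldsymbol{\tau}}) - \lambda_{\boldsymbol{\tau}}) + o_{\mathrm{P}}(1) \tag{3.20}$$

$$\stackrel{\mathcal{L}}{\longrightarrow} \mathcal{N}(0, \operatorname{Var}[\rho_\tau(\mathbf{c}_{\boldsymbol{\tau}}'\mathbf{Z}_1 - a_{\boldsymbol{\tau}})]). \tag{3.21}$$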

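As $\rho_{\tau}(\cdot)$ is a nonnegative function, the distribution of $\sqrt{n}\,(\lambda^{(n)}_{\boldsymbol{\tau}} - \lambda_{\boldsymbol{\tau}})$ is
likely to be skewed for finite $n$ [see (3.20)], which can be partly corrected via a
normalizing transformation such as that from [8]. Also, the proof of the above
theorem can be easily generalized to derive the asymptotic distribution of
vectors of the form $(a^{(n)}_{\boldsymbol{\tau}_1}, \mathbf{b}^{(n)\prime}_{\boldsymbol{\tau}_1}, \ldots, a^{(n)}_{\boldsymbol{\tau}_J}, \mathbf{b}^{(n)\prime}_{\boldsymbol{\tau}_J})'$, $J \in \mathbb{N}_0$.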

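Theorem 3.1 of course paves the way to inference about $\boldsymbol{\tau}$-quantiles; in
particular, it allows one to build confidence zones for them. Testing linear
restrictions on $\boldsymbol{\tau}$-quantile coefficients, that is, testing null hypotheses of the
form $\mathcal{H}_0 : (a_{\boldsymbol{\tau}}, \mathbf{b}'_{\boldsymbol{\tau}})' \in \mathcal{M}(a_0, \mathbf{b}_0, \mathbf{T}) := \{(a_0, \mathbf{b}'_0)' + \mathbf{T}\mathbf{v} : \mathbf{v} \in \mathbb{R}^{\ell}\}$ (indexed by
some $k$-vector $(a_0, \mathbf{b}'_0)'$ and some full-rank $k \times \ell$ matrix $\mathbf{T}$, $\ell < k$), can be
achieved in the same way as in [25]. Defining and studying such tests
requires a detailed investigation of the asymptotic behavior of the constrained
estimators

$(\tilde{a}^{(n)}_{\boldsymbol{\tau}}, \tilde{\mathbf{b}}^{(n)\prime}_{\boldsymbol{\tau}})' := \arg\min_{(a, \mathbf{b}')' \in \mathcal{M}(a_0, \mathbf{b}_0, \mathbf{T})} \Psi^{(n)}_{\boldsymbol{\tau}}(a, \mathbf{b}),$

which is beyond the scope of this work.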

4. Multivariate quantiles as depth contours. Turning to the contour
nature of our multivariate quantiles, we first define the (population and sample)
quantile regions and contours that naturally follow from Definitions 2.1 and
2.2 and their empirical counterparts, and state their basic properties. We
then establish the strong connections between those regions/contours and
the classical Tukey halfspace depth regions/contours. Finally, we compare
our results with those of Kong and Mizera [20] (Section 4.3) and Wei [37]
(Section 4.4).

4.1. Quantile regions. The proposed quantile regions are obtained by
taking, for some fixed $\tau$ ($= \|\boldsymbol{\tau}\|$), the "upper envelope" of our $\boldsymbol{\tau}$-quantile
hyperplanes. More precisely, for any $\tau \in (0, 1)$, we define our $\tau$-quantile region
$R(\tau)$ as

(4.1)  $R(\tau) := \bigcap_{\mathbf{u} \in \mathcal{S}^{k-1}} \cap\{H^{+}_{\tau\mathbf{u}}\},$

where $\cap\{H^{+}_{\tau\mathbf{u}}\}$ stands for the intersection of the collection $\{H^{+}_{\tau\mathbf{u}}\}$ of all
(closed) upper $(\tau\mathbf{u})$-quantile halfspaces (2.3); for $\tau = 0$, we simply let
$R(\tau) := \mathbb{R}^k$. The corresponding $\tau$-quantile contour then is defined as the boundary
$\partial R(\tau)$ of $R(\tau)$. At this stage, it is already clear that those $\tau$-quantile
regions are closed and convex (since they are obtained by intersecting closed
halfspaces). As we will see below, they are also nested: $R(\tau_1) \subset R(\tau_2)$ if
$\tau_1 > \tau_2$.

Empirical quantile regions $R^{(n)}(\tau)$ are obtained by replacing in (4.1) the
population quantile halfspaces $H^{+}_{\tau\mathbf{u}}$ with their sample counterparts $H^{(n)+}_{\tau\mathbf{u}}$,
yielding, parallel to (4.1),

(4.2)  $R^{(n)}(\tau) := \bigcap_{\mathbf{u} \in \mathcal{S}^{k-1}} \cap\{H^{(n)+}_{\tau\mathbf{u}}\}$

for any $\tau \in (0, 1)$, with $R^{(n)}(0) := \mathbb{R}^k$. Since they result from intersecting
finitely many closed halfspaces, these empirical quantile regions are closed
convex polyhedral sets, the faces of which all are part of some quantile
hyperplanes of order $\tau$. Another important property of our empirical regions,
which readily follows from the equivariance properties of Section 3.2, is that,
for any invertible $k \times k$ matrix $\mathbf{M}$ and any $k$-vector $\mathbf{d}$, using obvious
notation,

$R^{(n)}(\tau; \mathbf{M}\mathbf{Z}_1 + \mathbf{d}, \ldots, \mathbf{M}\mathbf{Z}_n + \mathbf{d}) = \mathbf{M} R^{(n)}(\tau; \mathbf{Z}_1, \ldots, \mathbf{Z}_n) + \mathbf{d}.$

Similarly, the population regions, in view of (3.13), satisfy the
affine-equivariance property $R(\tau; \mathbf{M}\mathbf{Z} + \mathbf{d}) = \mathbf{M} R(\tau; \mathbf{Z}) + \mathbf{d}$ for any such $\mathbf{M}$ and $\mathbf{d}$.

4.2. Connection with halfspace depth regions. Recall that the halfspace
or Tukey depth [36] of $\mathbf{z} \in \mathbb{R}^k$ with respect to the probability distribution
$\mathrm{P}$ is defined as $HD(\mathbf{z}, \mathrm{P}) := \inf\{\mathrm{P}[H] : H \text{ is a closed halfspace containing } \mathbf{z}\}$.
The halfspace depth region $D(\tau)$ of order $\tau \in [0, 1]$ associated with $\mathrm{P}$ then
collects all points of the $k$-dimensional Euclidean space with depth at least
$\tau$, that is,

(4.3)  $D(\tau) = D_{\mathrm{P}}(\tau) := \{\mathbf{z} \in \mathbb{R}^k : HD(\mathbf{z}, \mathrm{P}) \ge \tau\}.$

Clearly, $D(0) = \mathbb{R}^k$. Also, it is well known (see Proposition 6 in [30], or the
proof of Theorem 2.11 in [39] for a more general form) that, for any $\tau > 0$,

(4.4)  $D(\tau) = \bigcap\{H : H \text{ is a closed halfspace with } \mathrm{P}[\mathbf{Z} \in H] > 1 - \tau\}.$

The empirical version $D^{(n)}(\tau)$ of $D(\tau)$, as usual, is obtained by replacing, in
(4.3) and (4.4), the probability measure $\mathrm{P}$ with the empirical measure
associated with the observed $n$-tuple $\mathbf{Z}_1, \ldots, \mathbf{Z}_n$ at hand. As shown by the
following results, the population halfspace depth regions, under Assumption (A),
coincide with the quantile regions $R(\tau)$ defined in (4.1), and so do (almost
surely under Assumption (A$_n$)) their empirical counterparts $D^{(n)}(\tau)$,
whenever their interior is not empty, with the empirical quantile regions $R^{(n)}(\tau)$
(see the Appendix for the proofs).
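
As a purely illustrative aid to definition (4.3), the following minimal Python sketch approximates the empirical halfspace depth of a point in the bivariate case by scanning a finite grid of directions. The function name `approx_halfspace_depth` and the grid size `n_dir` are assumptions made for this illustration and are not part of the paper; since only finitely many halfspaces are inspected, the returned value can only overestimate the true depth.

```python
import math


def approx_halfspace_depth(z, points, n_dir=720):
    """Approximate empirical Tukey depth of z in R^2.

    For each direction u on a regular grid of the unit circle, count the
    fraction of data points in the closed halfspace {x : u'(x - z) >= 0}
    (whose boundary passes through z) and keep the smallest fraction.
    Scanning finitely many directions yields an upper bound on the depth.
    """
    n = len(points)
    depth = 1.0
    for j in range(n_dir):
        angle = 2.0 * math.pi * j / n_dir
        ux, uy = math.cos(angle), math.sin(angle)
        count = sum(
            1 for (x, y) in points
            if ux * (x - z[0]) + uy * (y - z[1]) >= 0.0
        )
        depth = min(depth, count / n)
    return depth


# Hypothetical toy data: the four corners of a square.
square = [(1.0, 1.0), (1.0, -1.0), (-1.0, 1.0), (-1.0, -1.0)]
center_depth = approx_halfspace_depth((0.0, 0.0), square)
outside_depth = approx_halfspace_depth((10.0, 0.0), square)
```

Collecting the points whose (approximate) depth is at least $\tau$ gives a crude picture of $D^{(n)}(\tau)$; the results below show how such regions can instead be obtained exactly from finitely many quantile hyperplanes.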



Theorem 4.1. Under Assumption (A), $R(\tau) = D(\tau)$ for all $\tau \in [0, 1)$.

Theorem 4.2. Assume that the $n$ ($\ge k + 1$) data points are in general
position. Then, for any $\ell \in \{1, 2, \ldots, n - k\}$ such that $D^{(n)}(\ell/n)$ has a nonempty
interior, we have that $R^{(n)}(\tau) = D^{(n)}(\ell/n)$ for all positive $\tau$ in $[(\ell - 1)/n, \ell/n)$.

Theorem 4.1 of course implies that, under Assumption (A), all results on
the halfspace depth regions $D(\tau)$ also apply to the $R(\tau)$ regions. It follows
that the $R(\tau)$'s are compact; the supremum of all $\tau$'s such that $R(\tau) \ne \emptyset$
belongs to $[1/(k + 1), 1/2]$, and takes value $1/2$ if and only if the distribution
of $\mathbf{Z}$ is angularly symmetric, in the sense that there exists some $k$-vector $\boldsymbol{\theta}$
such that $(\mathbf{Z} - \boldsymbol{\theta})/\|\mathbf{Z} - \boldsymbol{\theta}\|$ and $-(\mathbf{Z} - \boldsymbol{\theta})/\|\mathbf{Z} - \boldsymbol{\theta}\|$ have the same distribution (see [30] and [32]).
This implies that, under Assumption (A), we also may restrict to $\tau \in [0, 1/2]$.
As for Theorem 4.2, note that the restriction to halfspace depth regions with
nonempty interiors is not really restrictive, since it only applies to flat
deepest regions. Another major consequence of this relation between halfspace
depth and multivariate quantiles is that our sample multivariate quantiles,
just as the traditional univariate ones, completely determine (under the
assumptions of Theorem 4.2) the underlying empirical distribution $\mathrm{P}_n$, since
depth contours do (see [35]). This essentially extends to the population case
as well (see, e.g., Section 8 of [20] for a discussion).

Beyond that, Theorems 4.1 and 4.2, by showing that the halfspace depth
regions coincide with the upper envelope of directional quantile halfspaces,
and that the faces of the polyhedral empirical depth contours are parts of
empirical quantile hyperplanes, provide depth contours with a straightforward
quantile-based interpretation. Above all, Theorem 4.2 brings to the
halfspace depth context the extremely efficient computational features of
linear programming. This important issue is briefly discussed in Section 5;
we refer to [27] for details. See Figure 3 for two- and three-dimensional
illustrations.

4.3. Relation with projection quantiles. In this section, we discuss the
relation of our approach to the results of Kong and Mizera [20] on projection
quantiles. These results are somewhat similar to ours, since they also
lead to a reconstruction of Tukey's halfspace depth contours. As explained
in the Introduction, their $\boldsymbol{\tau}$-quantile is a point in the sample space; denoting
by $\tau \mapsto q_{\tau}(X)$ the traditional quantile function associated with the univariate
random variable $X$, the $\boldsymbol{\tau}$-quantile $\mathbf{q}_{\mathrm{KM};\boldsymbol{\tau}} = \mathbf{q}_{\mathrm{KM};\tau\mathbf{u}}$ of a random vector $\mathbf{Z}$
(actually, of its distribution) is defined as $q_{\tau}(\mathbf{u}'\mathbf{Z})\,\mathbf{u}$, with upper and lower
quantile halfspaces

(4.5)  $H^{+}_{\mathrm{KM};\tau\mathbf{u}} := \{\mathbf{z} \in \mathbb{R}^k : \mathbf{u}'\mathbf{z} \ge \mathbf{u}'\mathbf{q}_{\mathrm{KM};\tau\mathbf{u}}\}$

and

$H^{-}_{\mathrm{KM};\tau\mathbf{u}} := \{\mathbf{z} \in \mathbb{R}^k : \mathbf{u}'\mathbf{z} < \mathbf{u}'\mathbf{q}_{\mathrm{KM};\tau\mathbf{u}}\},$

respectively, and quantile hyperplane $\pi_{\mathrm{KM};\tau\mathbf{u}} := \{\mathbf{z} \in \mathbb{R}^k : \mathbf{u}'\mathbf{z} = \mathbf{u}'\mathbf{q}_{\mathrm{KM};\tau\mathbf{u}}\}$.
Note that those hyperplanes, contrary to ours, are orthogonal to $\mathbf{u}$, so that
the relation between $\mathbf{u}$ and $\pi_{\mathrm{KM};\tau\mathbf{u}}$ does not carry any information. Kong
and Mizera show that

(4.6)  $R_{\mathrm{KM}}(\tau) := \bigcap_{\mathbf{u} \in \mathcal{S}^{k-1}} \{H^{+}_{\mathrm{KM};\tau\mathbf{u}}\} = D(\tau)$

and that

(4.7)  $R^{(n)}_{\mathrm{KM}}(\tau) = D^{(n)}(\ell/n)$  for any $\tau \in [(\ell - 1)/n, \ell/n)$

(see [20] and [26] for different proofs of this latter equality), where $R^{(n)}_{\mathrm{KM}}(\tau)$
stands for the empirical version of $R_{\mathrm{KM}}(\tau)$, obtained by replacing $\mathrm{P}$ with
the empirical measure $\mathrm{P}_n$ associated with a sample of size $n$.

Fig. 3. Tukey contours $D^{(n)}(\tau)$ (in green) obtained for $n = 449$ from $\mathcal{U}([-0.5, 0.5]^k)$,
$\mathcal{N}(0, 1)^k$, and $t_1^k$ (the products of $k$ independent uniform, standard Gaussian and Cauchy
distributions, respectively), (a) for $k = 2$ and $\tau \in \{0.01, 0.05, 0.10, 0.15, 0.20, 0.45\}$,
and (b) for $k = 3$ and $\tau \in \{0.05, 0.10, 0.15, 0.20, 0.40\}$. For the same $n$ and
$\tau$'s, the Tukey depth contours (in green) from the mixtures (with obvious notation)
$0.2 \times \mathcal{N}(1.5, 1)^k + 0.8 \times \mathcal{N}(-1.5, 3)^k$ are provided for $k = 2$ and $3$ in (c), along with the
density contours (in blue) for $k = 2$. Only the contours falling in the plotting range are
displayed.

The results in (4.6) and (4.7) at first sight look pretty equivalent to those
of Theorems 4.1 and 4.2, since they also establish a close connection
between depth and directional quantiles (here, the Kong and Mizera ones).
That connection in (4.6) and (4.7), however, is much less exploitable than
in Theorems 4.1 and 4.2. It does provide the faces of the polyhedral empirical
depth regions $D^{(n)}(\tau)$ with a neat and interesting quantile interpretation:
each face of $D^{(n)}(\tau)$ indeed is part of the Kong and Mizera quantile
hyperplane $\pi^{(n)}_{\mathrm{KM};\tau\mathbf{u}_0}$, where $\mathbf{u}_0$ stands for the unit vector orthogonal to that face
and pointing to the interior of $D^{(n)}(\tau)$. Unless the depth region $D^{(n)}(\tau)$ is
available from some other source, this is not really helpful, though, since,
contrary to the collection $\{\pi^{(n)}_{\tau\mathbf{u}}\}$, which is finite, the collection $\{\pi^{(n)}_{\mathrm{KM};\tau\mathbf{u}}\}$, for
fixed $\tau$, contains infinitely many hyperplanes (one for each $\mathbf{u} \in \mathcal{S}^{k-1}$). And,
since the definition of the upper envelopes $R^{(n)}_{\mathrm{KM}}(\tau)$ of halfspaces $H^{(n)+}_{\mathrm{KM};\tau\mathbf{u}}$
involves an infinite number of such $H^{(n)+}_{\mathrm{KM};\tau\mathbf{u}}$'s, (4.7), contrary to Theorem
4.2, does not readily provide a feasible computation of $D^{(n)}(\tau)$. It is crucial
to understand, in that respect, that our quantile halfspaces $H^{(n)+}_{\tau\mathbf{u}}$ are
piecewise constant functions of $\mathbf{u}$, in sharp contrast with their Kong and Mizera
counterparts $H^{(n)+}_{\mathrm{KM};\tau\mathbf{u}}$: since $\partial H^{(n)+}_{\mathrm{KM};\tau\mathbf{u}}$ is orthogonal to $\mathbf{u}$ for any direction
$\mathbf{u}$, there are uncountably many such upper halfspaces in any neighborhood
of any fixed direction $\mathbf{u}$, even in the empirical case. To palliate this, Kong
and Mizera [20] propose to sample the unit sphere $\mathcal{S}^{k-1}$, which leads to
approximate envelopes that only approximately satisfy (4.7). Moreover,
denoting by $\mathbf{U}$ a random vector uniformly distributed over $\mathcal{S}^{k-1}$ (independent
of the sample), the probability that the corresponding quantile hyperplane
$\pi^{(n)}_{\mathrm{KM};\tau\mathbf{U}}$ contains some face of the Tukey depth contour of order $\tau$ is zero:
with probability one, the proposed approximation, thus, fails to recover any
of the faces of the actual depth contours. And, for a given sample size, the
quality of the approximation deteriorates extremely fast as $k$ increases.
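
To make the sampling idea concrete, here is a minimal Python sketch of an approximate projection-quantile envelope in the bivariate case: a point is declared inside when it lies in the upper halfspace (4.5) for every direction of a finite grid. The function name `km_envelope_contains`, the regular grid of `n_dir` directions, and the convention that the empirical $\tau$-quantile is the order statistic of rank $\lceil n\tau \rceil$ are assumptions of this sketch, not specifications taken from [20].

```python
import math


def km_envelope_contains(z, points, tau, n_dir=360):
    """Membership test for a direction-sampled projection-quantile envelope.

    For each sampled unit vector u, the empirical tau-quantile of the
    projections u'Z_i is taken as the order statistic of rank ceil(n*tau)
    (an assumed convention), and z must satisfy u'z >= that quantile.
    Using finitely many directions gives an outer approximation only.
    """
    n = len(points)
    rank = max(math.ceil(n * tau), 1)
    for j in range(n_dir):
        angle = 2.0 * math.pi * j / n_dir
        ux, uy = math.cos(angle), math.sin(angle)
        projections = sorted(ux * x + uy * y for (x, y) in points)
        quantile = projections[rank - 1]
        if ux * z[0] + uy * z[1] < quantile:
            return False  # z falls in the lower halfspace for this u
    return True


# Hypothetical toy data: the four corners of a square.
square = [(1.0, 1.0), (1.0, -1.0), (-1.0, 1.0), (-1.0, -1.0)]
inside = km_envelope_contains((0.0, 0.0), square, tau=0.3)
outside = km_envelope_contains((10.0, 0.0), square, tau=0.3)
```

Because the envelope is intersected over a finite grid only, the resulting set contains the exact region and, as noted above, generally misses the faces of the true depth contour.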
4.4. Relation to Wei's conditional quantiles. Another definition of
multivariate quantiles, which also extends from location to multiple-output
regression, has been proposed by Wei in [37]; see also [38]. Just as Kong and
Mizera's projection quantiles and ours, Wei's quantiles are directional
quantiles, associated with unit vectors $\mathbf{u} \in \boldsymbol{\mu} + \mathcal{S}^{k-1}$; a center $\boldsymbol{\mu} \in \mathbb{R}^k$ here has to
be chosen for the unit sphere, a choice that does have an impact on the final
result. Unlike Kong and Mizera's and ours, which are characterized globally,
Wei's quantiles, however, are conditional ones: the quantiles associated with
$\mathbf{u}$ indeed follow from conditional (on $\mathbf{u}$) outlyingness probabilistic
characterizations. As a consequence, they are of a local (with respect to $\mathbf{u}$) nature,
and their empirical versions therefore unavoidably involve some
nonparametric smoothing steps (see Remark 1 on page 399 of [37]). The resulting
contours are not convex (hence cannot coincide with depth contours) and
strongly depend on the choice of the centering $\boldsymbol{\mu}$.

5. Computational aspects. Computational issues in this context are
crucial, and we therefore briefly discuss them here. We first restrict to the
problem of computing (fixed-$\mathbf{u}$) directional quantiles and related quantities such
as the corresponding Lagrange multipliers $\lambda^{(n)}_{\boldsymbol{\tau}}$ in (3.8c), then consider the
computation of (fixed-$\tau$) quantile contours.

5.1. Computing directional quantiles. As we have seen in the previous
sections, the constrained formulation (2.4) of the definition of our
directional quantiles is richer than the unconstrained one (2.1), since it introduces
Lagrange multipliers, which bear highly relevant information (that can be
exploited for statistical inference; see Section 8). It is therefore natural to
focus on the computation of the sample quantiles $(a^{(n)}_{\boldsymbol{\tau}}, \mathbf{c}^{(n)\prime}_{\boldsymbol{\tau}})'$ in (2.6) first.

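The problem of finding $(a^{(n)}_{\boldsymbol{\tau}}, \mathbf{c}^{(n)\prime}_{\boldsymbol{\tau}})'$ can be reformulated as the linear
program (P)

$\min_{(a, \mathbf{c}', \mathbf{r}'_{+}, \mathbf{r}'_{-})' \in \mathbb{R} \times \mathbb{R}^k \times \mathbb{R}^n \times \mathbb{R}^n} \ \tau \mathbf{1}'_n \mathbf{r}_{+} + (1 - \tau) \mathbf{1}'_n \mathbf{r}_{-}$

subject to

(5.1)  $\mathbf{u}'\mathbf{c} = 1, \quad \mathbf{Z}'_n \mathbf{c} - a\mathbf{1}_n - \mathbf{r}_{+} + \mathbf{r}_{-} = \mathbf{0}, \quad \mathbf{r}_{+} \ge \mathbf{0}, \ \mathbf{r}_{-} \ge \mathbf{0},$

where we set $\mathbf{Z}_n := (\mathbf{Z}_1, \ldots, \mathbf{Z}_n)$ and write $\mathbf{r}_{+} := (\max(r_1, 0), \ldots, \max(r_n, 0))'$
and $\mathbf{r}_{-} := (-\min(r_1, 0), \ldots, -\min(r_n, 0))'$. Associated with problem (P) is
the dual problem (D)

$\max_{(\lambda_D, \boldsymbol{\mu}')' \in \mathbb{R} \times \mathbb{R}^n} \ \lambda_D$

subject to

$\mathbf{1}'_n \boldsymbol{\mu} = 0, \quad \lambda_D \mathbf{u} + \mathbf{Z}_n \boldsymbol{\mu} = \mathbf{0}, \quad -\tau \mathbf{1}_n \le \boldsymbol{\mu} \le (1 - \tau) \mathbf{1}_n,$

where $\lambda_D$ and $\boldsymbol{\mu}$ are the Lagrange multipliers corresponding to the first
and second equality constraint in (5.1), respectively. Both (P) and (D) have
at least one feasible solution (and therefore also an optimal one). This dual
formulation leads to a natural multiple-output generalization of the powerful
concept of regression rank scores introduced in [12], allowing for a
depth-related form of rank-based inference in this context. This promising line of
investigation is not considered here, and left for future research.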
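
For readers who wish to experiment, problem (P) can be handed to any generic LP solver. The sketch below is only an illustration under stated assumptions, not the implementation used in the paper: it relies on SciPy's `linprog` with the HiGHS backend, stores the observations as the rows of an $n \times k$ array, and uses the hypothetical function name `directional_quantile`. The optimal value of (P) divided by $n$ is returned as the sample multiplier.

```python
import numpy as np
from scipy.optimize import linprog


def directional_quantile(Z, u, tau):
    """Solve the linear program (P) for direction u and order tau.

    Z is an (n, k) array whose rows are the observations.  The decision
    vector is ordered as (a, c, r_plus, r_minus).
    Returns (a, c, lam), where lam is the optimal value of (P) over n.
    """
    Z = np.asarray(Z, dtype=float)
    n, k = Z.shape
    u = np.asarray(u, dtype=float)
    u = u / np.linalg.norm(u)

    # Objective: tau * 1'r_plus + (1 - tau) * 1'r_minus.
    cost = np.concatenate(
        [np.zeros(1 + k), tau * np.ones(n), (1.0 - tau) * np.ones(n)]
    )
    # Constraint u'c = 1.
    row_u = np.concatenate([[0.0], u, np.zeros(2 * n)])
    # Constraints Z c - a 1 - r_plus + r_minus = 0 (one row per observation).
    block = np.hstack([-np.ones((n, 1)), Z, -np.eye(n), np.eye(n)])
    A_eq = np.vstack([row_u, block])
    b_eq = np.concatenate([[1.0], np.zeros(n)])
    # a and c are free; r_plus and r_minus are nonnegative.
    bounds = [(None, None)] * (1 + k) + [(0.0, None)] * (2 * n)

    res = linprog(cost, A_eq=A_eq, b_eq=b_eq, bounds=bounds, method="highs")
    if not res.success:
        raise RuntimeError(res.message)
    a = float(res.x[0])
    c = res.x[1:1 + k]
    return a, c, float(res.fun) / n


# Univariate sanity check (k = 1, u = 1): (P) reduces to an ordinary
# sample quantile of the hypothetical data 1, ..., 5.
a1, c1, lam1 = directional_quantile(np.arange(1.0, 6.0).reshape(-1, 1), [1.0], 0.3)
```

In the univariate check the constraint forces $c = 1$, so the fitted intercept is the usual sample quantile of order 0.3 and the returned multiplier is the average check loss at that quantile.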

We need not worry about the possible nonuniqueness of the optimal
solutions of (P) since, as we have seen in Section 3.3, any sequence of such
solutions converges [under Assumption (A$_n$)] to the unique population
coefficient vector $(a_{\boldsymbol{\tau}}, \mathbf{c}'_{\boldsymbol{\tau}})'$ almost surely as $n \to \infty$. In practice, one could
compute $(a^{(n)}_{\boldsymbol{\tau}}, \mathbf{c}^{(n)\prime}_{\boldsymbol{\tau}})'$ by means of standard quantile regression of $\mathbf{0}_n$ on $(\mathbf{1}_n | \mathbf{Z}'_n)$
with an extra pseudo-observation consisting of response $C$ and corresponding
design row $(0, C\mathbf{u}')$ for some sufficiently large constant $C$, which, in the
limit, guarantees that the boundary constraint $\mathbf{u}'\mathbf{c}_{\boldsymbol{\tau}} = 1$ is satisfied; see [2]
for another application of the same trick.

Now, since $\lambda_D$ and $\lambda^{(n)}_{\boldsymbol{\tau}}$ are Lagrange multipliers associated with the same
constraint, the optimal value $\lambda_D$ of (D) satisfies

$\lambda_D = n \lambda^{(n)}_{\boldsymbol{\tau}},$

where, in view of (3.8c), $\lambda^{(n)}_{\boldsymbol{\tau}}$ has a clear meaning. Besides, due to the Strong
Duality Theorem, the optimal values of the objective functions in (P) and
(D) coincide. Therefore, $\lambda^{(n)}_{\boldsymbol{\tau}}$ is always unique and one has, with $\Psi^{c(n)}_{\boldsymbol{\tau}}$ defined
in (2.6),

(5.2)  $\lambda^{(n)}_{\boldsymbol{\tau}} = \Psi^{c(n)}_{\boldsymbol{\tau}}(a^{(n)}_{\boldsymbol{\tau}}, \mathbf{c}^{(n)}_{\boldsymbol{\tau}}) > 0$

(except for the rare case of exact fit where $\lambda^{(n)}_{\boldsymbol{\tau}} = 0$), which holds for all
optimal solutions to (P) and (D). In other words, $\lambda^{(n)}_{\boldsymbol{\tau}}$ can be obtained from
solving (P) as a by-product.

Most importantly, (5.2) allows us to focus on computing our $\boldsymbol{\tau}$-quantiles
through the unconstrained problem (2.5) without any loss of generality
because we may simply set $\lambda^{(n)}_{\boldsymbol{\tau}} = \Psi^{(n)}_{\boldsymbol{\tau}}(a^{(n)}_{\boldsymbol{\tau}}, \mathbf{b}^{(n)}_{\boldsymbol{\tau}})$. This approach is of course
advantageous because it falls directly into the realm of quantile regression,
as the problem of finding the sample $\boldsymbol{\tau}$-quantiles in (2.5) can be viewed as
looking for standard (that is, single-output) regression quantiles in the
regression of $Z_{\mathbf{u}}$ on the marginals of $\mathbf{Z}^{\perp}_{\mathbf{u}}$ and a constant (in the notation of
Section 2).

Needless to say, this interpretation has a large number of implications.
Above all, it offers fast, powerful and sophisticated tools for computing
sample $\boldsymbol{\tau}$-quantiles (along with the corresponding Lagrange multiplier $\lambda^{(n)}_{\boldsymbol{\tau}}$) in
any fixed direction $\mathbf{u}$ and possibly for all $\tau$'s at once, with $\boldsymbol{\tau} = \tau\mathbf{u}$ as usual.
In particular, there is an excellent package for advanced quantile regression
analysis in R (see [17]) and the key function for computing quantile
regression estimates is also freely available for Matlab, for example, from Roger
Koenker's homepage at http://www.econ.uiuc.edu/~roger/research/rq/rq.html.
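
The equivalence between the unconstrained and constrained objective functions that underlies this reduction can be checked numerically. The following NumPy sketch is an illustration under assumptions: the helper names `direction_basis`, `psi_unconstrained` and `psi_constrained` are hypothetical, and the orthonormal complement $\boldsymbol{\Gamma}_{\mathbf{u}}$ is built through a QR factorization, which is just one of many admissible choices. It evaluates the average check loss of the residuals $Z_{\mathbf{u}} - \mathbf{b}'\mathbf{Z}^{\perp}_{\mathbf{u}} - a$ and of $\mathbf{c}'\mathbf{Z} - a$ with $\mathbf{c} = \mathbf{u} - \boldsymbol{\Gamma}_{\mathbf{u}}\mathbf{b}$.

```python
import numpy as np


def direction_basis(u):
    """Return the normalized direction u and a k x (k-1) matrix Gamma
    such that (u : Gamma) is orthogonal (built here via QR)."""
    u = np.asarray(u, dtype=float)
    u = u / np.linalg.norm(u)
    k = u.size
    q, _ = np.linalg.qr(np.column_stack([u, np.eye(k)]))
    return u, q[:, 1:k]  # first column of q is +/- u, the rest span u-perp


def check_loss(x, tau):
    """Check function rho_tau(x) = x * (tau - 1[x < 0])."""
    return x * (tau - (x < 0))


def psi_unconstrained(Z, u, gamma, tau, a, b):
    """Average check loss of Z_u - b' Z_u_perp - a (single-output form)."""
    return float(np.mean(check_loss(Z @ u - (Z @ gamma) @ b - a, tau)))


def psi_constrained(Z, tau, a, c):
    """Average check loss of c'Z - a (constrained form, with u'c = 1)."""
    return float(np.mean(check_loss(Z @ c - a, tau)))


# Hypothetical simulated data, used only to compare the two objectives.
rng = np.random.default_rng(0)
Z = rng.standard_normal((50, 3))
u, gamma = direction_basis([1.0, 2.0, 2.0])
a, b = 0.3, np.array([0.5, -1.0])
c = u - gamma @ b  # link between the two parametrizations
value_b = psi_unconstrained(Z, u, gamma, 0.2, a, b)
value_c = psi_constrained(Z, 0.2, a, c)
```

Any single-output quantile regression routine applied to the response `Z @ u` and the regressors `Z @ gamma` (plus a constant) therefore minimizes the same criterion as (P).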

5.2. Computing quantile contours. As the previous subsection shows that
the computation of fixed-$\mathbf{u}$ directional quantiles is pretty straightforward, we
now turn to the problem of aggregating, as efficiently as possible, the
information associated with the various fixed-$\tau$ directional quantile halfspaces in
order to compute the regions $R^{(n)}(\tau)$ defined in (4.2). The main issue here lies in
the proper identification of the finite set of upper quantile halfspaces
characterizing $R^{(n)}(\tau)$. This can be achieved efficiently, for any given $\tau \ne \ell/n$,
$\ell \in \{0, 1, \ldots, n\}$, via parametric linear programming techniques. By restricting
(here and in Figures 1 to 8) to such $\tau$ values, we avoid (without any loss of
generality, in view of Theorem 4.2) the problems related with possibly multiple
solutions of (P) for integer values of $n\tau$.

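For any fixed $\tau \ne \ell/n$, parametric linear programming indeed reveals that,
under Assumption (A$_n$), $\mathbb{R}^k$ almost surely can be segmented into a finite
number of nondegenerate cones $C_i(\tau)$, $i = 1, 2, \ldots, N_C$, such that

$(a^{(n)}_{\tau\mathbf{u}}, \mathbf{c}^{(n)\prime}_{\tau\mathbf{u}}) = (a_i, \mathbf{c}'_i)/\mathbf{t}'_i\mathbf{u}$

and

$\mu^{(n)}_{\tau\mathbf{u},j} = \begin{cases} \boldsymbol{\nu}'_{ij}\mathbf{u}/\mathbf{t}'_i\mathbf{u} \in [-\tau, 1 - \tau], & \text{if } r_j = 0, \\ -\tau, & \text{if } r_j > 0, \\ 1 - \tau, & \text{if } r_j < 0, \end{cases}$

with $r_j := \mathbf{c}'_i\mathbf{Z}_j - a_i$, for any $\mathbf{u} \in C_i(\tau) \cap \mathcal{S}^{k-1}$, $i = 1, 2, \ldots, N_C$ and
$j = 1, \ldots, n$; see [27] for further details. Each cone $C_i(\tau)$ then corresponds to one
optimal basis $B_i = B_{i,\mathbf{u}}$ that uniquely determines constant scalars and
vectors $\lambda_i$, $a_i$, $\mathbf{c}_i$, $\boldsymbol{\nu}_{ij}$ and $\mathbf{t}_i$ and guarantees that $\mathbf{t}'_i\mathbf{u} > 0$ for any $\mathbf{u} \in C_i(\tau) \cap \mathcal{S}^{k-1}$.
Consequently, each cone $C_i(\tau)$ corresponds to exactly one quantile
hyperplane, and any statistic $s_{\mathbf{u}}$ of the form

$g_1(\lambda_{\mathbf{u}}, a_{\mathbf{u}}, \mathbf{c}_{\mathbf{u}})/g_2(\lambda_{\mathbf{u}}, a_{\mathbf{u}}, \mathbf{c}_{\mathbf{u}})$

is piecewise constant on the unit sphere whenever $g_1(\lambda, a, \mathbf{c})$ and $g_2(\lambda, a, \mathbf{c})$
are homogeneous functions of the same order. Figure 4 provides such cones
for a bivariate dataset.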
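
The piecewise-constant behavior described above can be observed on a toy bivariate example. The sketch below is not the parametric linear programming algorithm of [27]; it is a brute-force stand-in that assumes, as is standard for linear programs of this type, that an optimum is attained at a hyperplane through $k = 2$ data points, and simply enumerates all such lines for each direction of a fine grid. The function names and the simulated data are hypothetical.

```python
import itertools
import math

import numpy as np


def check_loss(x, tau):
    """Check function rho_tau(x) = x * (tau - 1[x < 0])."""
    return x * (tau - (x < 0))


def best_elemental_hyperplane(Z, u, tau):
    """Brute-force minimizer of sum_i rho_tau(c'Z_i - a) subject to
    u'c = 1, restricted to lines through two data points (k = 2)."""
    best = None
    for i, j in itertools.combinations(range(len(Z)), 2):
        d = Z[j] - Z[i]
        normal = np.array([-d[1], d[0]])  # normal to the line (Z_i, Z_j)
        scale = float(normal @ u)
        if abs(scale) < 1e-12:
            continue  # line parallel to u: u'c = 1 cannot be imposed
        c = normal / scale
        a = float(c @ Z[i])
        loss = float(check_loss(Z @ c - a, tau).sum())
        if best is None or loss < best[0]:
            best = (loss, (i, j), a, c)
    return best


def sweep_directions(Z, tau, n_dir=720):
    """Sweep the unit circle counter-clockwise and record, for each
    direction, which pair of points carries the optimal hyperplane."""
    labels = []
    for t in range(n_dir):
        angle = 2.0 * math.pi * t / n_dir
        u = np.array([math.cos(angle), math.sin(angle)])
        labels.append(best_elemental_hyperplane(Z, u, tau)[1])
    return labels


rng = np.random.default_rng(0)
Z = rng.standard_normal((9, 2))  # hypothetical sample of n = 9 points
labels = sweep_directions(Z, tau=0.2)
n_distinct = len(set(labels))
```

Consecutive directions typically share the same optimal pair of points, and the runs of identical labels correspond to the cones $C_i(\tau)$; the exact procedure of [27] locates the cone boundaries directly instead of scanning a grid.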

It remains to note that we may investigate all the cones $C_i(\tau)$'s by passing
through them counter-clockwise when $k = 2$. In general, we can use the
breadth-first search algorithm and always consider all such $C_j(\tau)$'s that are
adjacent to a cone treated in the previous step and have not been considered
yet. If $C_i(\tau)$ and $C_j(\tau)$ are adjacent cones with point $\mathbf{u}_f$ inside their common
face, then $B_{j,\mathbf{u}_f}$ (and consequently also $B_{j,\mathbf{u}}$) may be found from the primal
feasible basis $B_{i,\mathbf{u}_f}$ by only a few iterations of the primal simplex algorithm
at most.

Fig. 4. The eight cones $C_i(0.1)$ obtained for $\tau = 0.1$ (top left) and 18 cones $C_i(0.2)$
obtained for $\tau = 0.2$ (bottom left) via parametric linear programming from the same $n = 9$
points as in Figure 1, along with the corresponding (color matching) $(\tau = 0.1)$-quantile
hyperplanes (top right) and $(\tau = 0.2)$-quantile hyperplanes (bottom right).

Moreover, a careful reading of the proof of Theorem 4.2 reveals (see the
remark right after the proof) that a single fixed-$\tau$ collection of quantile
hyperplanes $\{\pi^{(n)}_{\tau\mathbf{u}} : \mathbf{u} \in \mathcal{S}^{k-1}\}$ typically contains all hyperplanes relevant for
the computation of $k$ consecutive Tukey depth contours. Technical details
are provided in [27]. A Matlab implementation of the procedure, which was
used to generate all the illustrations in this paper, is available from the
authors.






M. HALLIN, D. PAINDAVEINE AND M. SIMAN 



6. Multiple-output quantile regression. Our approach to multivariate
quantiles also allows us to define multiple-output regression quantiles enjoying
all nice properties of their classical single-output counterparts.

Consider the multiple-output regression problem in which the $m$-variate
response $\mathbf{Y} := (Y_1, \ldots, Y_m)'$ is to be regressed on the vector of regressors
$\mathbf{X} := (X_1, \ldots, X_p)'$, where $X_1 = 1$ a.s. and the other $X_j$'s are random. In
the sequel, we let $\mathbf{X} =: (1, \mathbf{W}')'$, so that $\{(\mathbf{w}', \mathbf{y}')' : \mathbf{w} \in \mathbb{R}^{p-1}, \mathbf{y} \in \mathbb{R}^m\} =
\mathbb{R}^{p-1} \times \mathbb{R}^m$ is the natural space for considering fitted regression "objects."
Multiple-output regression quantiles, in that context, can be obtained by
applying Definition 2.1 to the $k$-dimensional random vector $\mathbf{Z} := (\mathbf{W}', \mathbf{Y}')'$,
$k = p + m - 1$, with the important restriction that the direction $\mathbf{u}$ should be
taken in the response space only, that is, $\mathbf{u} \in \mathcal{S}^{m-1}_{p} := \{\mathbf{0}_{p-1}\} \times \mathcal{S}^{m-1} \subset \mathcal{S}^{k-1}$.
This directly yields the following definition.

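Definition 6.1. For any $\boldsymbol{\tau} = \tau\mathbf{u}$, with $\tau \in (0, 1)$ and $\mathbf{u} = (\mathbf{0}'_{p-1}, \mathbf{u}'_y)' \in
\mathcal{S}^{m-1}_{p}$, the regression $\boldsymbol{\tau}$-quantile of $\mathbf{Y}$ with respect to $\mathbf{X} = (1, \mathbf{W}')'$ is
defined as any element of the collection $\Pi_{\boldsymbol{\tau}}$ of hyperplanes $\pi_{\boldsymbol{\tau}} := \{(\mathbf{w}', \mathbf{y}')' \in
\mathbb{R}^{p+m-1} : \mathbf{u}'_y\mathbf{y} = \mathbf{b}'_{\boldsymbol{\tau}}\boldsymbol{\Gamma}'_{\mathbf{u}}(\mathbf{w}', \mathbf{y}')' + a_{\boldsymbol{\tau}}\}$ such that

(6.1)  $(a_{\boldsymbol{\tau}}, \mathbf{b}'_{\boldsymbol{\tau}})' \in \arg\min_{(a, \mathbf{b}')' \in \mathbb{R}^{p+m-1}} \Psi_{\boldsymbol{\tau}}(a, \mathbf{b}),$

where, denoting by $\boldsymbol{\Gamma}_{\mathbf{u}}$ an arbitrary $(p + m - 1) \times (p + m - 2)$ matrix such that
$(\mathbf{u} : \boldsymbol{\Gamma}_{\mathbf{u}})$ is orthogonal, we let $\Psi_{\boldsymbol{\tau}}(a, \mathbf{b}) := \mathrm{E}[\rho_{\tau}(\mathbf{u}'_y\mathbf{Y} - \mathbf{b}'\boldsymbol{\Gamma}'_{\mathbf{u}}(\mathbf{W}', \mathbf{Y}')' - a)]$.

Although (similarly as in Definition 2.1) the choice of $\boldsymbol{\Gamma}_{\mathbf{u}}$ has no impact
on the directional regression quantile $\pi_{\boldsymbol{\tau}}$, it is here natural to take $\boldsymbol{\Gamma}_{\mathbf{u}}$ of the
form

$\boldsymbol{\Gamma}_{\mathbf{u}} = \begin{pmatrix} \mathbf{I}_{p-1} & \mathbf{0} \\ \mathbf{0} & \boldsymbol{\Gamma}_{\mathbf{u}_y} \end{pmatrix},$

where $\mathbf{I}_{p-1}$ denotes the $(p - 1)$-dimensional identity matrix and the $m \times
(m - 1)$ matrix $\boldsymbol{\Gamma}_{\mathbf{u}_y}$ is such that $(\mathbf{u}_y : \boldsymbol{\Gamma}_{\mathbf{u}_y})$ is orthogonal. The directional
regression quantiles in Definition 6.1 then take the form

$\pi_{\boldsymbol{\tau}} = \{(\mathbf{w}', \mathbf{y}')' \in \mathbb{R}^{p+m-1} : \mathbf{u}'_y\mathbf{y} = \mathbf{b}'_{\boldsymbol{\tau}y}\boldsymbol{\Gamma}'_{\mathbf{u}_y}\mathbf{y} + \mathbf{b}'_{\boldsymbol{\tau}w}\mathbf{w} + a_{\boldsymbol{\tau}}\}.$

Regression quantiles can also be obtained by extending Definition 2.2 in the same
fashion; see [26].

Now, as in the location case, each quantile hyperplane $\pi_{\boldsymbol{\tau}}$ characterizes a
lower (open) and an upper (closed) regression quantile halfspace defined as

(6.2)  $H^{-}_{\boldsymbol{\tau}} := \{(\mathbf{w}', \mathbf{y}')' \in \mathbb{R}^{p+m-1} : \mathbf{u}'_y\mathbf{y} < \mathbf{b}'_{\boldsymbol{\tau}y}\boldsymbol{\Gamma}'_{\mathbf{u}_y}\mathbf{y} + \mathbf{b}'_{\boldsymbol{\tau}w}\mathbf{w} + a_{\boldsymbol{\tau}}\}$

and

(6.3)  $H^{+}_{\boldsymbol{\tau}} := \{(\mathbf{w}', \mathbf{y}')' \in \mathbb{R}^{p+m-1} : \mathbf{u}'_y\mathbf{y} \ge \mathbf{b}'_{\boldsymbol{\tau}y}\boldsymbol{\Gamma}'_{\mathbf{u}_y}\mathbf{y} + \mathbf{b}'_{\boldsymbol{\tau}w}\mathbf{w} + a_{\boldsymbol{\tau}}\},$

respectively. Most importantly, for fixed $\tau$ ($= \|\boldsymbol{\tau}\|$) $\in (0, 1)$, (multiple-output)
$\tau$-quantile regression regions are obtained by taking the "upper envelope"
of our regression $\boldsymbol{\tau}$-quantile hyperplanes. More precisely, for any $\tau \in (0, 1)$,
we define regression $\tau$-quantile regions $R_{\mathrm{regr}}(\tau)$ as

(6.4)  $R_{\mathrm{regr}}(\tau) := \bigcap_{\mathbf{u} \in \mathcal{S}^{m-1}_{p}} \cap\{H^{+}_{\tau\mathbf{u}}\}$

[with corresponding regression quantile contours $\partial R_{\mathrm{regr}}(\tau)$], where $H^{+}_{\tau\mathbf{u}}$
denotes the (closed) upper regression $(\tau\mathbf{u})$-quantile halfspace in (6.3). Unlike
the location quantile regions ($p = 1$), regression quantile regions ($p > 1$) may
be nonnested, an $m$-dimensional form of the familiar regression quantile
crossing phenomenon.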

Finite-sample versions of all regression concepts above are obtained,
similarly as in the location case (Section 2), as the natural sample analogs of the
corresponding population concepts; see Figures 5 and 6 for an illustration.
From a numerical point of view, Section 5.2, with obvious minor changes, still
describes how to compute the resulting regression quantile regions $R^{(n)}_{\mathrm{regr}}(\tau)$,
with $m$ and $\mathbf{u}_y$ substituted for $k$ and $\mathbf{u}$, respectively.
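
The restriction of the direction to the response space can be illustrated with a short NumPy sketch. It builds, for a given response direction, the full direction $\mathbf{u}$ with $p - 1$ leading zeros and a block-diagonal complement matrix with blocks $\mathbf{I}_{p-1}$ and an orthonormal complement of $\mathbf{u}_y$, and then tests membership in an upper halfspace of the form (6.3). The function names `regression_direction` and `in_upper_halfspace`, the QR-based construction and the numerical coefficients are assumptions made for illustration only.

```python
import numpy as np


def regression_direction(u_y, p):
    """Build u = (0_{p-1}, u_y) and a block-diagonal Gamma_u with blocks
    I_{p-1} and Gamma_{u_y}, where (u_y : Gamma_{u_y}) is orthogonal."""
    u_y = np.asarray(u_y, dtype=float)
    u_y = u_y / np.linalg.norm(u_y)
    m = u_y.size
    q, _ = np.linalg.qr(np.column_stack([u_y, np.eye(m)]))
    gamma_y = q[:, 1:m]  # m x (m - 1), orthogonal to u_y
    u = np.concatenate([np.zeros(p - 1), u_y])
    gamma = np.zeros((p + m - 1, p + m - 2))
    gamma[:p - 1, :p - 1] = np.eye(p - 1)
    gamma[p - 1:, p - 1:] = gamma_y
    return u, gamma, gamma_y


def in_upper_halfspace(w, y, u_y, gamma_y, a, b_w, b_y):
    """Closed upper halfspace test:
    u_y'y >= b_y' Gamma_{u_y}' y + b_w' w + a."""
    lhs = float(u_y @ y)
    rhs = float(b_y @ (gamma_y.T @ y) + b_w @ w + a)
    return lhs >= rhs


# Hypothetical setting: p = 2 (constant plus one covariate), m = 2 responses.
u, gamma, gamma_y = regression_direction([3.0, 4.0], p=2)
u_y = u[1:]
point_w, point_y = np.array([1.0]), np.array([3.0, 4.0])
above = in_upper_halfspace(point_w, point_y, u_y, gamma_y,
                           a=0.0, b_w=np.array([0.0]), b_y=np.array([0.0]))
below = in_upper_halfspace(point_w, point_y, u_y, gamma_y,
                           a=10.0, b_w=np.array([0.0]), b_y=np.array([0.0]))
```

Intersecting such halfspaces over the response directions, with coefficients fitted as in Section 5, gives the sample regression quantile region.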

The Kong and Mizera projection approach also readily generalizes to the 
multiple-output regression setting. This issue is briefly addressed in Section 
11.3 of [20]; see [26] for a detailed comparison with our approach. As for the 
conditional quantiles of Wei [37], their regression version shares the same 
local features as their location counterpart. 

7. A real data application. In order to illustrate the implementability 
and data-analytical power of the concepts we are proposing, we now con- 
sider a real data example. Since a thorough case study is beyond the scope 
of this paper, we only present some very partial results of an investigation 
of the body girth measurement dataset considered in [14]. That dataset 
consists of joint measurements of nine skeletal and twelve body girth di- 
mensions, along with weight, height and age, in a group of 247 young men 
and 260 young women, all physically active. We refer to [14] for details;
note, however, that these $n = 507$ observations cannot be considered a
representative random sample from any well-defined population, so that the
regression quantile contours we are providing below should be taken from a
descriptive/illustrative point of view only.

For each gender, taking as regressors a constant term and (with
notation $W$) either weight, age, height or the body mass index (defined as
BMI := weight/height$^2$), we considered all $\binom{21}{2} = 210$ possible
bivariate-output regression models, and computed the regression tubes for $\tau = 0.01$,
0.03, 0.10, 0.25 and 0.40, respectively. Three-dimensional pictures of those
tubes are not easy to read, and we rather plot, for each of them, a series
of five cuts. These cuts were obtained as the intersections of the regression
tube under study with hyperplanes of the form $w = w_{(p)}$, where $w_{(p)}$ stands
for the (empirical) $p$th quantile of the covariate $W$, $p = 0.10$ (black), 0.30
(blue), 0.50 (green), 0.70 (cyan) and 0.90 (yellow). The results are presented,
for women, $Y_1$ the calf maximal girth, and $Y_2$ the thigh maximal girth, with
$W$ the weight, age, BMI and height, respectively, in Figure 7.

Fig. 5. Two different views on the regression $\tau$-quantile contours (in
green) from $n = 9{,}999$ data points for $\tau \in \{0.01, 0.05, 0.15, 0.30, 0.45\}$ in a
homoskedastic [$(Y_1, Y_2)' = (X_2, X_2)' + (\varepsilon_1, \varepsilon_2)'$; left] and a heteroskedastic
[$(Y_1, Y_2)' = (X_2, X_2)' + \sqrt{X_2}\,(\varepsilon_1, \varepsilon_2)'$; right] bivariate-output regression setting,
respectively, where $X_2 \sim \mathcal{U}([0, 4])$, and $\varepsilon_1$ and $\varepsilon_2$ are independent centered Gaussian
variables with variances 1 and 9, respectively.

Results look quite different depending on the choice of regressors.
Regression with respect to weight shows a clear positive trend in location (all
contours), along with an increasing dispersion, and an evolution of "principal
directions," yielding higher variability in calf than in thigh girth for lighter
weights (horizontal first "principal direction"), while heavier weights tend
to exhibit the opposite phenomenon (vertical first "principal direction").
Quite on the contrary, regression with respect to age apparently does not
reveal any location trend: the inner contours almost coincide for all age cuts,
and "principal directions" (roughly, parallel to the main bisectors)
apparently do not change with age. However, the shapes of outer contours vary
quite significantly with age, indicating an increasing (with age) simultaneous
variability of both calf and thigh girth largest values. While the results for
BMI look very similar to those for weight, a new phenomenon appears when
height is the regressor, namely a clear regression effect for some contours
(the inner ones) but not for the others, so that the asymmetry structure
of the conditional distribution strongly depends on height: the conditional
distributions seem much more asymmetric for low values of height than for
the higher ones.

Fig. 6. Various cuts of the regression $\tau$-quantile "hypertube" contours from the same two
models (left and right, respectively) as in Figure 5 with $n = 9{,}999$ observations. The top
plots provide regression $\tau$-quantile cuts, $\tau \in \{0.05, 0.10, 0.15, 0.20, \ldots, 0.45\}$, through 10%
(magenta), 30% (blue), 50% (green), 70% (cyan) and 90% (yellow) empirical quantiles of
$X_2$; the bottom ones show regression $\tau$-quantile cuts for the same $\tau$ values, and through
25% (blue), 50% (green) and 75% (yellow) empirical quantiles of $Y_1$. Their centers provide
information about trend and their shapes and sizes shed light on variability.

Fig. 7. Four empirical regression quantile plots from the body girth
measurements dataset (women subsample; see [14]). The regression models considered are
$(Y_1, Y_2)' = \boldsymbol{\beta}'(1, W)' + (\varepsilon_1, \varepsilon_2)'$, where $Y_1$ is the calf maximum girth and $Y_2$ the thigh
maximum girth, while $W$ stands for weight (top left), age (top right), BMI index (bottom left)
or height (bottom right). The plots are providing, for $\tau = 0.01$, 0.03, 0.10, 0.25 and 0.40,
the cuts of the empirical regression $\tau$-quantile contours, at the empirical $p$-quantiles of the
regressors, for $p = 0.10$ (black), 0.30 (blue), 0.50 (green), 0.70 (cyan) and 0.90 (yellow).
Data points are shown in red (the lighter the red color, the higher the regressor value).

These variations in location, scale, shape and asymmetry structures clearly
yield a much richer and subtler analysis of the impact of weight/age/BMI/height
on those body measurements than any traditional regression method can
provide.

8. Final comments. This work presents a new concept of multivariate
quantile based on $L_1$ optimization ideas and clarifies the quantile nature
of halfspace depth contours, while providing an extremely efficient way to
compute the latter. The same concept readily allows for an extension of
quantile regression to the multiple-output context, thus paving the way to a
multiple-output generalization of the many tools and techniques that have
been based on the standard (single-output) Koenker and Bassett concept of
quantile regression. This final section quickly discusses several open
problems, of high practical relevance, that could now be considered.

First of all, Section 6 only very briefly indicates how our multivariate 
quantiles extend to the context of multiple-output regression; that exten- 
sion clearly calls for a more detailed study, covering standard asymptotic 
issues (limiting distributions, Bahadur representations) as well as robust- 
ness aspects (breakdown points and influence functions). Nonlinear quantile 
regression problems also should be addressed via, for instance, local linear 
methods. 

The regression rank score perspectives (associated with linear program- 
ming duality) sketched in Section 5.1 also look extremely promising, possi- 
bly leading to the development of a full body of multivariate, depth-related 
methods of rank-based inference. 

Finally, as mentioned in the Introduction and in Section 3.1, various
concepts introduced in this paper can be quite useful for inference. As an
example, note that the symmetry (central, elliptical or spherical) structure of
$\mathrm{P}$ is reflected in the mappings

$\mathbf{u} \mapsto \lambda_{\tau\mathbf{u}}\mathbf{u}/\lambda^{(\infty)}_{\tau}$  and  $\mathbf{u} \mapsto \|\mathbf{c}_{\tau\mathbf{u}}\|\mathbf{u}/c^{(\infty)}_{\tau},$

with $\lambda^{(\infty)}_{\tau} := \sup_{\mathbf{u} \in \mathcal{S}^{k-1}} \lambda_{\tau\mathbf{u}}$ and $c^{(\infty)}_{\tau} := \sup_{\mathbf{u} \in \mathcal{S}^{k-1}} \|\mathbf{c}_{\tau\mathbf{u}}\|$, as illustrated (with
the corresponding empirical quantities, of course) in Figure 8. A test of the
hypothesis that the density of $\mathbf{Z}$ is, for example, spherically symmetric thus
could be based on [the empirical version $T^{(n)} := T(\mathrm{P}^{(n)})$ of] a functional of
the form

$\int_0^1 \int_{\mathcal{S}^{k-1}} d\bigl(\lambda_{\tau\mathbf{u}}/\lambda^{(\infty)}_{\tau}, 1\bigr)\, d\sigma(\mathbf{u})\, w(\tau)\, d\tau,$

where $d(\cdot, \cdot)$ denotes some distance (such as that of Cramér-von Mises), $w$
some positive weight function over $(0, 1)$, and $\sigma$ the uniform measure over
$\mathcal{S}^{k-1}$. Deriving the asymptotic properties of such statistics, however, clearly
requires uniform versions of the asymptotic results in Theorem 3.1.






Fig. 8. Polar plots of the mappings $\mathbf{u} \in \mathcal{S}^1 \mapsto \lambda^{(n)}_{\tau\mathbf{u}}\mathbf{u}/(\sup_{\mathbf{v} \in \mathcal{S}^1} \lambda^{(n)}_{\tau\mathbf{v}})$ (left) and $\mathbf{u} \in \mathcal{S}^1 \mapsto
\|\mathbf{c}^{(n)}_{\tau\mathbf{u}}\|\mathbf{u}/(\sup_{\mathbf{v} \in \mathcal{S}^1} \|\mathbf{c}^{(n)}_{\tau\mathbf{v}}\|)$ (right), for $\tau = 0.1$, and $n = 49{,}999$ points (top) [resp., $n = 299$
points (bottom)] drawn from $\mathcal{N}(0, 1)^2$ (the product of two independent standard Gaussian
distributions, in green), $\mathcal{U}([-0.5, 0.5]^2)$ (the centered bivariate uniform distribution over
the unit square, in blue), and $(\mathrm{Exp}(1) - 1)^2$ (the product of two independent standard
exponential distributions, in red) populations, respectively; see Section 8. The resulting
shapes clearly reflect the axes of symmetry of the underlying distributions.



APPENDIX 

Proof of Theorem 3.1. The quantity $\eta_{i,\boldsymbol{\tau}}(a, b) := J_u'\, \xi_{i,\boldsymbol{\tau}}(a, b)$ is a
subgradient for $(a, b')' \mapsto \rho_\tau(Z_{iu} - b'Z^{\perp}_{iu} - a)$ since, for all $(a, b')'$, $(a_0, b_0')' \in
\mathbb{R}^k$, we have that

$$\rho_\tau(Z_{iu} - b'Z^{\perp}_{iu} - a) - \rho_\tau(Z_{iu} - b_0'Z^{\perp}_{iu} - a_0) - (a - a_0, b' - b_0')\, \eta_{i,\boldsymbol{\tau}}(a_0, b_0)$$
$$= \big(I[u'Z_i - b_0'\Gamma_u'Z_i - a_0 < 0] - I[u'Z_i - b'\Gamma_u'Z_i - a < 0]\big)(u'Z_i - b'\Gamma_u'Z_i - a) \ge 0,$$

irrespective of the value of $Z_i$. Hence, interchanging differentiation and
expectation (which is justified in a standard way) shows that $(a, b')' \mapsto
\Psi_{\boldsymbol{\tau}}(a, b)$ (see Definition 2.1) satisfies $\operatorname{grad} \Psi_{\boldsymbol{\tau}}(a, b) = \operatorname{grad} E[\rho_\tau(Z_{iu} - b'Z^{\perp}_{iu} -
a)] = E[\eta_{i,\boldsymbol{\tau}}(a, b)]$; see (3.2a) and (3.2b). Therefore,

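$$\operatorname{grad} \Psi_{\boldsymbol{\tau}}(a_{\boldsymbol{\tau}} + \Delta_a, b_{\boldsymbol{\tau}} + \Delta_b) - \operatorname{grad} \Psi_{\boldsymbol{\tau}}(a_{\boldsymbol{\tau}}, b_{\boldsymbol{\tau}}) - H_{\boldsymbol{\tau}}(\Delta_a, \Delta_b')'$$
$$= \int_{\mathbb{R}^{k-1}} \int_{a_{\boldsymbol{\tau}} + b_{\boldsymbol{\tau}}'x}^{(a_{\boldsymbol{\tau}} + \Delta_a) + (b_{\boldsymbol{\tau}} + \Delta_b)'x} \big(f(zu + \Gamma_u x) - f((a_{\boldsymbol{\tau}} + b_{\boldsymbol{\tau}}'x)u + \Gamma_u x)\big)(1, x')'\, dz\, dx$$

and Assumption (A$'_n$) yields that

$$\|\operatorname{grad} \Psi_{\boldsymbol{\tau}}(a_{\boldsymbol{\tau}} + \Delta_a, b_{\boldsymbol{\tau}} + \Delta_b) - \operatorname{grad} \Psi_{\boldsymbol{\tau}}(a_{\boldsymbol{\tau}}, b_{\boldsymbol{\tau}}) - H_{\boldsymbol{\tau}}(\Delta_a, \Delta_b')'\|$$
$$\le C \int_{\mathbb{R}^{k-1}} \int_{a_{\boldsymbol{\tau}} + b_{\boldsymbol{\tau}}'x}^{(a_{\boldsymbol{\tau}} + \Delta_a) + (b_{\boldsymbol{\tau}} + \Delta_b)'x} \frac{|z - (a_{\boldsymbol{\tau}} + b_{\boldsymbol{\tau}}'x)|^{s}\, \|(1, x')'\|}{\big(1 + \|\tfrac{1}{2}(z + a_{\boldsymbol{\tau}} + b_{\boldsymbol{\tau}}'x)u + \Gamma_u x\|^2\big)^{(3+r+s)/2}}\, dz\, dx$$
$$\le C \int_{\mathbb{R}^{k-1}} \frac{|\Delta_a + \Delta_b'x|^{1+s}\, \|(1, x')'\|}{(1 + \|x\|^2)^{(3+r+s)/2}}\, dx$$
$$\le C\, \|(\Delta_a, \Delta_b')'\|^{1+s} \int_{\mathbb{R}^{k-1}} \|(1, x')'\|^{-(r+1)}\, dx = o(\|(\Delta_a, \Delta_b')'\|)$$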



as $\|(\Delta_a, \Delta_b')'\| \to 0$. This shows that $(a, b')' \mapsto \Psi_{\boldsymbol{\tau}}(a, b)$ is twice differentiable
at $(a_{\boldsymbol{\tau}}, b_{\boldsymbol{\tau}}')'$ with Hessian matrix $H_{\boldsymbol{\tau}}$. Since, moreover, Assumption
(A$'_n$) clearly ensures that $E[\|\eta_{i,\boldsymbol{\tau}}(a, b)\|^2] < \infty$ for all $(a, b')' \in \mathbb{R}^k$, Theorem 4
of [25] applies, which establishes (3.16). Of course, (3.17) results from
(3.16) by the multivariate CLT.
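
The subgradient inequality used at the start of this proof can be checked numerically. The sketch below is only an illustration: it takes the subgradient to be the usual almost-everywhere gradient of the check function, $-(\tau - I[r < 0])(1, (\Gamma_u'Z)')'$ with $r$ the residual, which is an explicit form assumed here rather than quoted from Section 3, and the names `rho` and `subgradient_identity` are hypothetical.

```python
import numpy as np

def rho(x, tau):
    """Check function rho_tau(x) = x * (tau - I[x < 0])."""
    return x * (tau - float(x < 0))

def subgradient_identity(z, u, gamma, tau, theta, theta0):
    """Return both sides of the subgradient identity for one observation z.

    theta = (a, b')' and theta0 = (a0, b0')' are two parameter values;
    gamma is a matrix whose columns play the role of Gamma_u.
    """
    x = np.concatenate(([1.0], gamma.T @ z))   # regressor (1, Gamma_u' z)
    r = float(u @ z - theta @ x)               # residual at theta
    r0 = float(u @ z - theta0 @ x)             # residual at theta0
    eta0 = -(tau - float(r0 < 0)) * x          # assumed subgradient at theta0
    lhs = rho(r, tau) - rho(r0, tau) - (theta - theta0) @ eta0
    rhs = (float(r0 < 0) - float(r < 0)) * r
    return lhs, rhs
```

Both returned values should agree up to rounding, and the second is nonnegative by construction, which is exactly the property that makes the quantity a subgradient.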

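Recall that, under Assumption (A), the unique solution of (2.1) can be
written as $(a_{\boldsymbol{\tau}}, b_{\boldsymbol{\tau}}')' := (a_{\boldsymbol{\tau}}, (-\Gamma_u'c_{\boldsymbol{\tau}})')'$, where $(a_{\boldsymbol{\tau}}, c_{\boldsymbol{\tau}}')'$ denotes the unique
solution of (2.4). Similarly, any solution $(a^{(n)}_{\boldsymbol{\tau}}, b^{(n)\prime}_{\boldsymbol{\tau}})'$ of (2.5) is related to
some solution $(a^{(n)}_{\boldsymbol{\tau}}, c^{(n)\prime}_{\boldsymbol{\tau}})'$ of (2.6) via the relation $(a^{(n)}_{\boldsymbol{\tau}}, b^{(n)\prime}_{\boldsymbol{\tau}})' =
(a^{(n)}_{\boldsymbol{\tau}}, (-\Gamma_u'c^{(n)}_{\boldsymbol{\tau}})')'$. This allows for rewriting (3.16) as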

/ (n) _ \ , n 

(A.l) V^P, JM _ =-— H;ij:,^^^,,(a.,c.) + op(l) 

\C-r Ct J 

as ?i — )• oo. By first premultiplying both sides of (A.l) with P^Ju, then 
using PuF^ = \k ~ uu' [which follows from the orthogonality of (u:Pu)] and 
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u c-;- = 1 = u Ct, we obtain 

/ (n) _ \ , n 

^ ( C^"^) _lj= -^^kJuU-'j'^ ^U^r,C.) + 0P(1) 

as $n \to \infty$. Lemma A.1 below therefore establishes (3.18). Again, the multi-
variate CLT then trivially yields (3.19).

Finally, applying Theorem 6 in [25] [more precisely, applying the version
(a) of statement (3.8) in that theorem] with $L = I_k$ and $c = (a_{\boldsymbol{\tau}}, b_{\boldsymbol{\tau}}')'$ yields

n*i"^(G,, W) - n^i")(ai"\bi")) 

- ^ E ^U(«-b.)JuH;ij',^^.,(a,,W) = op(l) 

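as $n \to \infty$. Note that the third term is clearly $O_P(1)$ as $n \to \infty$. The result
then follows by dividing both sides of (A.2) by $\sqrt{n}$, and by using the iden-
tities $\lambda^{(n)}_{\boldsymbol{\tau}} = \Psi^{(n)}_{\boldsymbol{\tau}}(a^{(n)}_{\boldsymbol{\tau}}, b^{(n)}_{\boldsymbol{\tau}})$ (see the end of Section 5.1) and $u'z - b_{\boldsymbol{\tau}}'\Gamma_u'z -
a_{\boldsymbol{\tau}} = c_{\boldsymbol{\tau}}'z - a_{\boldsymbol{\tau}}$ for all $z \in \mathbb{R}^k$. Since (3.7) entails $\lambda_{\boldsymbol{\tau}} = E[\rho_\tau(c_{\boldsymbol{\tau}}'Z_i - a_{\boldsymbol{\tau}})]$, the
CLT yields (3.21). □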



In order to complete the proof of Theorem 3.1, it is sufficient to establish 
the following lemma. 

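Lemma A.1. The matrix $G_{\boldsymbol{\tau}} := J_u(J_u'H^{c}_{\boldsymbol{\tau}}J_u)^{-1}J_u'$ is the Moore-Penrose
pseudoinverse of $H^{c}_{\boldsymbol{\tau}}$, that is, $G_{\boldsymbol{\tau}}$ is such that (i) $G_{\boldsymbol{\tau}}H^{c}_{\boldsymbol{\tau}}G_{\boldsymbol{\tau}} = G_{\boldsymbol{\tau}}$, (ii)
$H^{c}_{\boldsymbol{\tau}}G_{\boldsymbol{\tau}}H^{c}_{\boldsymbol{\tau}} = H^{c}_{\boldsymbol{\tau}}$, (iii) $(G_{\boldsymbol{\tau}}H^{c}_{\boldsymbol{\tau}})' = G_{\boldsymbol{\tau}}H^{c}_{\boldsymbol{\tau}}$ and (iv) $(H^{c}_{\boldsymbol{\tau}}G_{\boldsymbol{\tau}})' = H^{c}_{\boldsymbol{\tau}}G_{\boldsymbol{\tau}}$.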



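Proof. (i) This directly follows from trivial computations. (ii) Let $K_u$
be the invertible matrix $(J_u \,\vdots\, \tilde{u})$, where $\tilde{u} := (0, u')'$. Clearly, $(H^{c}_{\boldsymbol{\tau}}G_{\boldsymbol{\tau}}H^{c}_{\boldsymbol{\tau}} -
H^{c}_{\boldsymbol{\tau}})J_u = 0$ and the definition of $H^{c}_{\boldsymbol{\tau}}$ implies that $\tilde{u}$ belongs to the null space
of $H^{c}_{\boldsymbol{\tau}}$. Hence, $(H^{c}_{\boldsymbol{\tau}}G_{\boldsymbol{\tau}}H^{c}_{\boldsymbol{\tau}} - H^{c}_{\boldsymbol{\tau}})K_u = 0$, which establishes the result. (iii)
and (iv) Since $J_u'J_u = I_k$, $(G_{\boldsymbol{\tau}}H^{c}_{\boldsymbol{\tau}} - H^{c}_{\boldsymbol{\tau}}G_{\boldsymbol{\tau}})J_u = J_u - H^{c}_{\boldsymbol{\tau}}J_u(J_u'H^{c}_{\boldsymbol{\tau}}J_u)^{-1} =
0$; the last equality follows, as in the proof of part (ii), by showing that $(J_u -
H^{c}_{\boldsymbol{\tau}}J_u(J_u'H^{c}_{\boldsymbol{\tau}}J_u)^{-1})'K_u = 0$. Now, as we also have that $(G_{\boldsymbol{\tau}}H^{c}_{\boldsymbol{\tau}} - H^{c}_{\boldsymbol{\tau}}G_{\boldsymbol{\tau}})\tilde{u} =
0$, we conclude that $(G_{\boldsymbol{\tau}}H^{c}_{\boldsymbol{\tau}} - H^{c}_{\boldsymbol{\tau}}G_{\boldsymbol{\tau}})K_u = 0$, hence that $G_{\boldsymbol{\tau}}H^{c}_{\boldsymbol{\tau}} = H^{c}_{\boldsymbol{\tau}}G_{\boldsymbol{\tau}}$.
This establishes (iii) and (iv) since both $H^{c}_{\boldsymbol{\tau}}$ and $G_{\boldsymbol{\tau}}$ are symmetric. □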
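
The four Penrose conditions of Lemma A.1 are easy to verify numerically on a random instance. The following sketch makes two assumptions that are not spelled out in this appendix: that $J_u$ is the $(k+1) \times k$ block-diagonal matrix with blocks $1$ and $\Gamma_u$, and that the $(k+1) \times (k+1)$ matrix of the lemma is obtained from a positive definite $k \times k$ matrix $H$ as $J_u H J_u'$. The function name `lemma_a1_matrices` is hypothetical.

```python
import numpy as np

def lemma_a1_matrices(k=3, seed=0):
    """Build a random instance (Hc, G) of the matrices in Lemma A.1.

    Assumed forms: J_u = diag(1, Gamma_u) and Hc = J_u H J_u',
    with H a random symmetric positive definite k x k matrix.
    """
    rng = np.random.default_rng(seed)
    u = rng.normal(size=k)
    u /= np.linalg.norm(u)
    # QR of (u, random columns): the first column of q is +/- u,
    # the remaining k-1 columns form an orthonormal basis of u's complement.
    q, _ = np.linalg.qr(np.column_stack([u, rng.normal(size=(k, k - 1))]))
    gamma = q[:, 1:]
    J = np.zeros((k + 1, k))
    J[0, 0] = 1.0
    J[1:, 1:] = gamma
    a = rng.normal(size=(k, k))
    H = a @ a.T + k * np.eye(k)          # symmetric positive definite
    Hc = J @ H @ J.T                      # singular: (0, u')' is in its null space
    G = J @ np.linalg.inv(J.T @ Hc @ J) @ J.T
    return Hc, G
```

On such an instance, `G` should satisfy conditions (i) to (iv) up to rounding, and it should agree with `np.linalg.pinv(Hc, rcond=1e-10)`, where the explicit cutoff guards against the numerically zero singular value.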



Proof of Theorem 4.1. Under Assumption (A), it directly follows
from (4.4) that, for any $\tau \in (0,1)$ (note that Theorem 4.1 trivially holds
for $\tau = 0$), $D(\tau) = \bigcap\{H : H$ is a closed halfspace with $P[Z \in H] > 1 - \tau\}$.
Consequently, by noting that any $H_{\mathrm{KM};\tau u}$, $u \in S^{k-1}$ [see (4.5)], satisfies
$P[Z \in H_{\mathrm{KM};\tau u}] = 1 - \tau$ under Assumption (A), it follows from (4.6) that

$$D(\tau) \subset \bigcap\{H : H \text{ is a closed halfspace with } P[Z \in H] = 1 - \tau\} \subset \bigcap_{u \in S^{k-1}} \{H_{\mathrm{KM};\tau u}\} = D(\tau),$$

which entails that, still under Assumption (A),

(A.3) $D(\tau) = \bigcap\{H : H$ is a closed halfspace with $P[Z \in H] = 1 - \tau\}$.

Now, since (3.2a) [equivalently, (3.5a)] implies that any closed quantile
halfspace $H^{+}_{\tau u}$, $u \in S^{k-1}$, satisfies $P[Z \in H^{+}_{\tau u}] = 1 - \tau$, (A.3) yields that
$D(\tau) \subset R(\tau)$. To show $D(\tau) \supset R(\tau)$, consider an arbitrary closed halfspace
$H$ with $P[Z \in H] = 1 - \tau$. Then $H = H^{+}_{\tau u}$, with

$$u = \frac{(1/(1-\tau))\,E[Z\, I[Z \in H]] - (1/\tau)\,E[Z\, I[Z \in \mathbb{R}^k \setminus H]]}{\big\|(1/(1-\tau))\,E[Z\, I[Z \in H]] - (1/\tau)\,E[Z\, I[Z \in \mathbb{R}^k \setminus H]]\big\|},$$

so that $R(\tau) \subset D(\tau)$; see (3.6) and (A.3) again. □

Proof of Theorem 4.2. We start with some remarks on sample halfspace
depth regions. By (4.4), $D^{(n)}(\ell/n)$, for any $\ell \in \{1, 2, \ldots, n - k\}$, coincides
with the intersection of all closed halfspaces containing at least $n - \ell + 1$ ob-
servations. Actually, one can restrict to closed halfspaces containing exactly
$n - \ell + 1$ observations (see [9], page 1805). Also, it can be shown (see [11])
that $D^{(n)}(\ell/n)$, provided that its interior is not the empty set, is bounded
by hyperplanes containing at least $k$ points that span a $(k-1)$-dimensional
subspace of $\mathbb{R}^k$.

Now, fix $\ell \in \{1, 2, \ldots, n - k\}$ such that $D^{(n)}(\ell/n)$ has indeed a nonempty
interior. Consider an arbitrary closed halfspace $H$ containing exactly $n -
\ell + 1$ data points, among which exactly $k$ ($Z_i$, $i \in h = \{i_1, \ldots, i_k\}$, say) sit
in $\partial H$ (and actually span $\partial H$, since the data points are assumed to be in
general position). It follows from the results stated in the previous paragraph
that $D^{(n)}(\ell/n)$, under the assumptions of Theorem 4.2, coincides with the
intersection of all such halfspaces.
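
The characterization of sample depth regions recalled at the start of this proof says that a point belongs to the region of order $\ell$ exactly when every closed halfspace containing it holds at least $\ell$ observations. For $k = 2$ this can be checked directly. The sketch below is a plain angular scan and not the parametric linear programming approach of the paper; the function names and the tolerance `min_gap` are assumptions made for this example.

```python
import numpy as np

def halfspace_depth_2d(x, data, min_gap=1e-9):
    """Smallest number of data points in a closed halfplane whose boundary passes through x."""
    x = np.asarray(x, dtype=float)
    d = np.asarray(data, dtype=float) - x
    norms = np.hypot(d[:, 0], d[:, 1])
    at_x = int(np.sum(norms == 0.0))      # points equal to x lie in every such halfplane
    d = d[norms > 0.0]
    if d.shape[0] == 0:
        return at_x
    ang = np.arctan2(d[:, 1], d[:, 0])
    # The count changes only at directions orthogonal to some data vector.
    crit = np.sort(np.concatenate([ang + np.pi / 2, ang - np.pi / 2]) % (2 * np.pi))
    nxt = np.append(crit[1:], crit[0] + 2 * np.pi)
    keep = (nxt - crit) > min_gap         # skip degenerate (duplicate) arcs
    mid = 0.5 * (crit[keep] + nxt[keep])  # one test direction per open arc
    dirs = np.column_stack([np.cos(mid), np.sin(mid)])
    counts = (dirs @ d.T >= 0.0).sum(axis=1)
    return at_x + int(counts.min())

def in_depth_region(x, data, ell):
    """True if x lies in the sample depth region of order ell (depth count >= ell)."""
    return halfspace_depth_2d(x, data) >= ell
```

For instance, with the four points $(\pm 1, 0)$, $(0, \pm 1)$ and the origin, the origin has depth 3 and each of the other four points has depth 1. The scan only evaluates one direction per open arc, because closed halfplanes at the critical directions can only contain more points than those on the adjacent arcs.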

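Letting $S_\tau(n, k, \ell) := (n - k - \ell + 1)\tau + (\ell - 1)(\tau - 1)$, define then

(A.4) $$u := \frac{T_D - S_\tau(n, k, \ell)\, T_\partial}{\|T_D - S_\tau(n, k, \ell)\, T_\partial\|},$$

where

$$T_D := \tau \sum_{Z_i \in H \setminus \partial H} Z_i + (\tau - 1) \sum_{Z_i \notin H} Z_i \quad\text{and}\quad T_\partial := \frac{1}{k} \sum_{i \in h} Z_i.$$

Taking $\Gamma_u$ as in Definition 2.1, one of course has $\Gamma_u'T_D = S_\tau(n, k, \ell)\, \Gamma_u'T_\partial$.
Hence, writing $(a_h, b_h')'$ for the unique solution of

$$u'Z_i - b'\Gamma_u'Z_i - a = 0, \qquad i \in h,$$

we obtain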



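$$\sum_{i \in \{1,\ldots,n\} \setminus h} \big(\tau - I[u'Z_i - b_h'\Gamma_u'Z_i - a_h < 0]\big) \begin{pmatrix} 1 \\ \Gamma_u'Z_i \end{pmatrix}
= \tau \sum_{Z_i \in H \setminus \partial H} \begin{pmatrix} 1 \\ \Gamma_u'Z_i \end{pmatrix} + (\tau - 1) \sum_{Z_i \notin H} \begin{pmatrix} 1 \\ \Gamma_u'Z_i \end{pmatrix}$$
$$= \begin{pmatrix} S_\tau(n, k, \ell) \\ \Gamma_u'T_D \end{pmatrix} = S_\tau(n, k, \ell) \begin{pmatrix} 1 \\ \Gamma_u'T_\partial \end{pmatrix}.$$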



Since [see (3.10)] 



K \ -■- u ("^ 



this implies that, with the same notation as in the end of Section 3.1, we
have

_ Sr{n,k,£) 
^TuW J, Ifc, 

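hence that the subgradient conditions (3.11) are satisfied for any $\tau \in [\frac{\ell-1}{n}, \frac{\ell+k-1}{n}]$.
It follows that, for any such $\tau$, $H$ coincides with the upper quantile halfspace
$H^{(n)+}_{\tau u}$ associated with some $\pi^{(n)}_{\tau u} \in \Pi^{(n)}_{\tau u}$, where $u$ is as in (A.4), so that

(A.5) $$R^{(n)}(\tau) := \bigcap_{u \in S^{k-1}} \bigcap \{H^{(n)+}_{\tau u}\} \subset D^{(n)}(\ell/n)$$

for any positive $\tau \in [\frac{\ell-1}{n}, \frac{\ell}{n})$; one should indeed avoid the value $\tau = 0$, for
which $R^{(n)}(\tau)$ is not defined as the upper envelope of quantile halfspaces.
Now, fix $\tau \in (0, \frac{\ell}{n})$. Then, according to (3.9), all upper sample quantile
halfspaces $H^{(n)+}_{\tau u}$ generating $R^{(n)}(\tau)$ contain $P + Z \ge \lceil n(1-\tau) \rceil = n - \lfloor n\tau \rfloor \ge
n - \ell + 1$ observations. Hence, $D^{(n)}(\ell/n) \subset R^{(n)}(\tau)$ for any such $\tau$. This, jointly
with (A.5), establishes the result. □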

Most interestingly, the proof of Theorem 4.2 actually shows that, for any
$\tau \in (0, 1)$, the set $\{\pi^{(n)}_{\tau u} : u \in S^{k-1}, \pi^{(n)}_{\tau u}$ contains $k$ data points$\}$ coincides with
the collection of all hyperplanes passing through $k$ observations and cutting
off at most $\lfloor n\tau \rfloor$ and at least $\lceil n\tau \rceil - k$ data points. Consequently, as stated
at the end of Section 5.2, the set of $\tau$-quantile hyperplanes in all directions
provides enough material to compute not only one, but $\min(k + \iota_{n\tau}, \lfloor n\tau \rfloor + 1)$
Tukey depth contours at a time, where $\iota_x$ is one if $x$ is an integer and zero
otherwise.
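
The count of depth contours obtained at a time can be cross-checked by listing the admissible numbers of cut-off points. The sketch below reads the closing formula as $\min(k + \iota_{n\tau}, \lfloor n\tau \rfloor + 1)$, which is an interpretation of the statement above and not a quotation; the function names are hypothetical, and $\tau$ is passed as an exact fraction so that the integrality of $n\tau$ is tested without rounding error.

```python
from fractions import Fraction
from math import ceil, floor

def cut_off_counts(n, k, tau):
    """Numbers m of cut-off data points with max(0, ceil(n*tau) - k) <= m <= floor(n*tau)."""
    t = n * Fraction(tau)
    lo = max(0, ceil(t) - k)
    hi = floor(t)
    return list(range(lo, hi + 1))

def n_contours(n, k, tau):
    """Closed-form count min(k + iota, floor(n*tau) + 1), with iota = 1 iff n*tau is an integer."""
    t = n * Fraction(tau)
    iota = 1 if t.denominator == 1 else 0
    return min(k + iota, floor(t) + 1)
```

With $n = 10$ and $k = 2$, taking $\tau = 1/4$ gives the two cut-off counts 1 and 2, while $\tau = 3/10$ gives the three counts 1, 2 and 3, in agreement with the closed form in both cases.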
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